HEPATITIS C VIRUS PROTEASE INHIBITORS

Cross-Reference to Related Application 
This application is a continuation-in-part of U.S. Serial No.
07/505,434, filed 4 April 1990.

Technical Field 

This invention relates to the molecular biology and virology of the
hepatitis C virus (HCV). More specifically, this invention relates to a novel protease
produced by HCV, methods of expression, recombinant protease, protease mutants,
and inhibitors of HCV protease.

Background of the Invention 

Non-A, Non-B hepatitis (NANBH) is a transmissible disease (or family
of diseases) that is believed to be virally induced, and is distinguishable from other
forms of virus-associated liver disease, such as those caused by hepatitis A virus
(HAV), hepatitis B virus (HBV), delta hepatitis virus (HDV), cytomegalovirus (CMV)
or Epstein-Barr virus (EBV). Epidemiologic evidence suggests that there may be three
types of NANBH: the water-borne epidemic type; the blood or needle associated
type; and the sporadically occurring (community acquired) type. However, the
number of causative agents is unknown. Recently, a new viral species,
hepatitis C virus (HCV), has been identified as the primary (if not only) cause of
blood-associated NANBH (BB-NANBH). See for example, PCT WO89/046699; U.S.
Patent Application Serial No. 7/456,637, filed 21 December 1989; and U.S. Patent
Application Serial No. 7/456,637, filed 21 December 1989, incorporated herein by
reference. Hepatitis C appears to be the major form of transfusion-associated hepatitis
in a number of countries, including the United States and Japan. There is also
evidence implicating HCV in induction of hepatocellular carcinoma. Thus, a need
exists for an effective method for treating HCV infection: currently, there is none.

Many viruses, including adenoviruses, baculoviruses, comoviruses,
picornaviruses, retroviruses, and togaviruses, rely on specific, virally-encoded proteases
for processing polypeptides from their initial translated form into mature, active
proteins. In the case of picornaviruses, all of the viral proteins are believed to arise
from cleavage of a single polyprotein (B.D. Korant, CRC Crit Rev Biotech (1988)
1:149-57).

S. Pichuantes et al, in "Viral Proteinases As Targets For Chemotherapy"
(Cold Spring Harbor Laboratory Press, 1989) pp. 215-22, disclosed expression of a
viral protease found in HIV-1. The HIV protease was obtained in the form of a fusion
protein, by fusing DNA encoding an HIV protease precursor to DNA encoding human
superoxide dismutase (hSOD), and expressing the product in E. coli. Transformed
cells expressed products of 36 and 10 kDa (corresponding to the hSOD-protease fusion
protein and the protease alone), suggesting that the protease was expressed in a form
capable of autocatalytic proteolysis.

T.J. McQuade et al, Science (1990) 247:454-56 disclosed preparation of
a peptide mimic capable of specifically inhibiting the HIV-1 protease. In HIV, the
protease is believed responsible for cleavage of the initial p55 gag precursor transcript
into the core structural proteins (p17, p24, p8, and p7). Adding 1 μM inhibitor to
HIV-infected peripheral blood lymphocytes in culture reduced the concentration of
processed HIV p24 by about 70%. Viral maturation and levels of infectious virus
were reduced by the protease inhibitor.

Disclosure of the Invention

We have now invented recombinant HCV protease, HCV protease
fusion proteins, truncated and altered HCV proteases, cloning and expression vectors
therefor, and methods for identifying antiviral agents effective for treating HCV.

Brief Description of the Drawings 

Figure 1 shows the sequence of HCV protease. 

Figure 2 shows the polynucleotide sequence and deduced amino acid 
sequence of the clone C20c. 
Figure 3 shows the polynucleotide sequence and deduced amino acid
sequence of the clone C26d.

Figure 4 shows the polynucleotide sequence and deduced amino acid 
sequence of the clone C8h. 

Figure 5 shows the polynucleotide sequence and deduced amino acid
sequence of the clone C7f.

Figure 6 shows the polynucleotide sequence and deduced amino acid
sequence of the clone C31.

Figure 7 shows the polynucleotide sequence and deduced amino acid 
sequence of the clone C35. 
Figure 8 shows the polynucleotide sequence and deduced amino acid
sequence of the clone C33c.

Figure 9 schematically illustrates assembly of the vector 
C7fC20cC300C200. 

Figure 10 shows the sequence for cf1SODp600.

Modes of Carrying Out The Invention

A. Definitions 
The terms "Hepatitis C Virus" and "HCV" refer to the viral species that
is the major etiological agent of BB-NANBH, the prototype isolate of which is
identified in PCT WO89/046699; EPO publication 318,216; USSN 7/355,008, filed
18 May 1989; and USSN 7/456,637, the disclosures of which are incorporated herein
by reference. "HCV" as used herein includes the pathogenic strains capable of causing
hepatitis C, and attenuated strains or defective interfering particles derived therefrom.
The HCV genome is comprised of RNA. It is known that RNA-containing viruses
have relatively high rates of spontaneous mutation, reportedly on the order of 10⁻³ to
10⁻⁴ per incorporated nucleotide (Fields & Knipe, "Fundamental Virology" (1986,
Raven Press, N.Y.)). As heterogeneity and fluidity of genotype are inherent
characteristics of RNA viruses, there will be multiple strains/isolates, which may be
virulent or avirulent, within the HCV species.
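
As a rough illustration of what this mutation rate implies, the short sketch below multiplies the reported per-nucleotide rate by a genome length. The genome length of about 10,000 nucleotides is the approximate figure given elsewhere in this specification; the function name is hypothetical and the result is an order-of-magnitude estimate only, not a measured value.

```python
def expected_mutations(genome_length_nt: int, rate_per_nt: float) -> float:
    """Expected number of misincorporated nucleotides in one genome copy."""
    return genome_length_nt * rate_per_nt


# Approximate HCV genome length stated elsewhere in this specification.
GENOME_LENGTH_NT = 10_000

# Lower and upper ends of the reported range (1e-4 to 1e-3 per nucleotide).
low = expected_mutations(GENOME_LENGTH_NT, 1e-4)
high = expected_mutations(GENOME_LENGTH_NT, 1e-3)
print(f"about {low:.0f} to {high:.0f} expected changes per genome copy")
```

On these assumptions roughly one to ten changes would be expected each time the genome is copied, which is consistent with the expectation of multiple strains and isolates.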

Information on several different strains/isolates of HCV is disclosed
herein, particularly strain or isolate CDC/HCV1 (also called HCV1). Information from
one strain or isolate, such as a partial genomic sequence, is sufficient to allow those
skilled in the art using standard techniques to isolate new strains/isolates and to
identify whether such new strains/isolates are HCV. For example, several different
strains/isolates are described below. These strains, which were obtained from a
number of human sera (and from different geographical areas), were isolated utilizing
the information from the genomic sequence of HCV1.

The information provided herein suggests that HCV may be distantly
related to the Flaviviridae. The Flavivirus family contains a large number of viruses
which are small, enveloped pathogens of man. The morphology and composition of
Flavivirus particles are known, and are discussed in M.A. Brinton, in "The Viruses:
The Togaviridae And Flaviviridae" (Series eds. Fraenkel-Conrat and Wagner, vol. eds.
Schlesinger and Schlesinger, Plenum Press, 1986), pp. 327-374. Generally, with
respect to morphology, Flaviviruses contain a central nucleocapsid surrounded by a
lipid bilayer. Virions are spherical and have a diameter of about 40-50 nm. Their
cores are about 25-30 nm in diameter. Along the outer surface of the virion envelope
are projections measuring about 5-10 nm in length with terminal knobs about 2 nm in
diameter. Typical examples of the family include Yellow Fever virus, West Nile
virus, and Dengue Fever virus. They possess positive-stranded RNA genomes (about
11,000 nucleotides) that are slightly larger than that of HCV and encode a polyprotein
precursor of about 3500 amino acids. Individual viral proteins are cleaved from this
precursor polypeptide.

The genome of HCV appears to be single-stranded RNA containing
about 10,000 nucleotides. The genome is positive-stranded, and possesses a
continuous translational open reading frame (ORF) that encodes a polyprotein of about
3,000 amino acids. In the ORF, the structural proteins appear to be encoded in
approximately the first quarter of the N-terminal region, with the majority of the
polyprotein attributed to non-structural proteins. When compared with all known viral
sequences, small but significant co-linear homologies are observed with the
non-structural proteins of the Flavivirus family, and with the pestiviruses (which are now
also considered to be part of the Flavivirus family).

A schematic alignment of possible regions of a flaviviral polyprotein
(using Yellow Fever Virus as an example), and of a putative polyprotein encoded in
the major ORF of the HCV genome, is shown in Figure 1. Possible domains of the
HCV polyprotein are indicated in the figure. The Yellow Fever Virus polyprotein
contains, from the amino terminus to the carboxy terminus, the nucleocapsid protein
(C), the matrix protein (M), the envelope protein (E), and the non-structural proteins 1,
2 (a+b), 3, 4 (a+b), and 5 (NS1, NS2, NS3, NS4, and NS5). Based upon the putative
amino acids encoded in the nucleotide sequence of HCV1, a small domain at the
extreme N-terminus of the HCV polyprotein appears similar both in size and high
content of basic residues to the nucleocapsid protein (C) found at the N-terminus of
flaviviral polyproteins. The non-structural proteins 2, 3, 4, and 5 (NS2-5) of HCV and of
yellow fever virus (YFV) appear to have counterparts of similar size and
hydropathicity, although the amino acid sequences diverge. However, the region of HCV which
would correspond to the regions of YFV polyprotein which contain the M, E, and
NS1 proteins not only differs in sequence, but also appears to be quite different in size
and hydropathicity. Thus, while certain domains of the HCV genome may be referred
to herein as, for example, NS1, or NS2, it should be understood that these designations
are for convenience of reference only; there may be considerable differences between
the HCV family and flaviviruses that have yet to be appreciated.

Due to the evolutionary relationship of the strains or isolates of HCV,
putative HCV strains and isolates are identifiable by their homology at the polypeptide
level. With respect to the isolates disclosed herein, new HCV strains or isolates are
expected to be at least about 40% homologous, some more than about 70%
homologous, and some even more than about 80% homologous: some may be more
than about 90% homologous at the polypeptide level. The techniques for determining
amino acid sequence homology are known in the art. For example, the amino acid
sequence may be determined directly and compared to the sequences provided herein.
Alternatively the nucleotide sequence of the genomic material of the putative HCV
may be determined (usually via a cDNA intermediate), the amino acid sequence
encoded therein can be determined, and the corresponding regions compared.
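
The comparison of corresponding regions can be sketched as a simple percent-identity calculation. This is only an illustration: it assumes that "homology" is scored as the fraction of identical residues over pre-aligned positions, which the text does not specify, and the function name and the gap character "-" are hypothetical choices. The example inputs are nine-residue segments listed in Table 1 below.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of identical residues between two pre-aligned sequences.

    Positions where either sequence has a gap ('-') are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


# Segments around the putative catalytic His, as listed in Table 1.
hcv_vs_yfv = percent_identity("CWTVYHGAG", "EHTMWHVTR")
wnv_vs_kunjin = percent_identity("FHTLWHTTK", "FHTLWHTTK")
print(f"HCV vs Yellow Fever: {hcv_vs_yfv:.1f}%")
print(f"West Nile vs Kunjin: {wnv_vs_kunjin:.1f}%")
```

A candidate isolate would be compared over much longer regions than these short segments; the thresholds quoted above (about 40%, 70%, 80%, 90%) would then be applied to the resulting score.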

The term "HCV protease" refers to an enzyme derived from HCV which
exhibits proteolytic activity, specifically the polypeptide encoded in the NS3 domain of
the HCV genome. At least one strain of HCV contains a protease believed to be
substantially encoded by or within the following sequence:

Arg Arg Gly Arg Glu Ile Leu Leu Gly Pro    10
Ala Asp Gly Met Val Ser Lys Gly Trp Arg    20
Leu Leu Ala Pro Ile Thr Ala Tyr Ala Gln    30
Gln Thr Arg Gly Leu Leu Gly Cys Ile Ile    40
Thr Ser Leu Thr Gly Arg Asp Lys Asn Gln    50
Val Glu Gly Glu Val Gln Ile Val Ser Thr    60
Ala Ala Gln Thr Phe Leu Ala Thr Cys Ile    70
Asn Gly Val Cys Trp Thr Val Tyr His Gly    80
Ala Gly Thr Arg Thr Ile Ala Ser Pro Lys    90
Gly Pro Val Ile Gln Met Tyr Thr Asn Val   100
Asp Gln Asp Leu Val Gly Trp Pro Ala Ser   110
Gln Gly Thr Arg Ser Leu Thr Pro Cys Thr   120
Cys Gly Ser Ser Asp Leu Tyr Leu Val Thr   130
Arg His Ala Asp Val Ile Pro Val Arg Arg   140
Arg Gly Asp Ser Arg Gly Ser Leu Leu Ser   150
Pro Arg Pro Ile Ser Tyr Leu Lys Gly Ser   160
Ser Gly Gly Pro Leu Leu Cys Pro Ala Gly   170
His Ala Val Gly Ile Phe Arg Ala Ala Val   180
Cys Thr Arg Gly Val Ala Lys Ala Val Asp   190
Phe Ile Pro Val Glu Asn Leu Glu Thr Thr   200
Met Arg ...                               202

The above N and C termini are putative, the actual termini being
defined by expression and processing in an appropriate host of a DNA construct
encoding the entire NS3 domain. It is understood that this sequence may vary from
strain to strain, as RNA viruses like HCV are known to exhibit a great deal of
variation. Further, the actual N and C termini may vary, as the protease is cleaved
from a precursor polyprotein: variations in the protease amino acid sequence can
result in cleavage from the polyprotein at different points. Thus, the amino- and
carboxy-termini may differ from strain to strain of HCV. The first amino acid shown
above corresponds to residue 60 in Figure 1. However, the minimum sequence
necessary for activity can be determined by routine methods. The sequence may be
truncated at either end by treating an appropriate expression vector with an
exonuclease (after cleavage at the 5' or 3' end of the coding sequence) to remove any desired
number of base pairs. The resulting coding polynucleotide is then expressed and the
sequence determined. In this manner the activity of the resulting product may be
correlated with the amino acid sequence: a limited series of such experiments
(removing progressively greater numbers of base pairs) determines the minimum
internal sequence necessary for protease activity. We have found that the sequence
may be substantially truncated, particularly at the carboxy terminus, apparently with
full retention of protease activity. It is presently believed that a portion of the protein
at the carboxy terminus may exhibit helicase activity. However, helicase activity is
not required of the HCV proteases of the invention. The amino terminus may also be
truncated to a degree without loss of protease activity.
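
The bookkeeping for such a truncation series can be sketched as follows. This is a hypothetical illustration at the protein level only: the function names are invented for the example, and `is_active` stands in for whatever laboratory assay is used to score each expressed variant. The toy sequence and activity rule are placeholders, not data.

```python
from typing import Callable, Iterator, Optional


def c_terminal_truncations(protein: str, step: int, min_length: int) -> Iterator[str]:
    """Yield progressively shorter variants, removing `step` residues
    from the carboxy terminus each time, down to `min_length`."""
    if step <= 0:
        raise ValueError("step must be positive")
    length = len(protein)
    while length >= min_length:
        yield protein[:length]
        length -= step


def shortest_active(protein: str, step: int, min_length: int,
                    is_active: Callable[[str], bool]) -> Optional[str]:
    """Return the shortest variant that still scores as active,
    stopping at the first inactive variant in the series."""
    shortest = None
    for variant in c_terminal_truncations(protein, step, min_length):
        if not is_active(variant):
            break
        shortest = variant
    return shortest


# Toy example: a 10-residue placeholder and a stand-in "assay" that
# calls any variant of six or more residues active.
toy = "ABCDEFGHIJ"
result = shortest_active(toy, step=2, min_length=2,
                         is_active=lambda v: len(v) >= 6)
print(result)
```

The same bookkeeping applies to amino-terminal truncations by slicing from the other end; combining the two bounds gives an estimate of the minimum internal sequence.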

The amino acids underlined above are believed to be the residues
necessary for catalytic activity, based on sequence homology to putative flavivirus
serine proteases. Table 1 shows the alignment of the three serine protease catalytic
residues for HCV protease and the proteases obtained from Yellow Fever Virus, West
Nile Fever virus, Murray Valley Fever virus, and Kunjin virus. Although the other
four flavivirus protease sequences exhibit higher homology with each other than with
HCV, a degree of homology is still observed with HCV. This homology, however,
was not sufficient for indication by currently available alignment software. The
indicated amino acids are numbered His79, Asp103, and Ser161 in the sequence listed
above (His139, Asp163, and Ser221 in Figure 1).
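
The residue numbering can be checked directly against the sequence listing. The sketch below uses a one-letter transcription of the 202-residue listing above; the transcription itself is an assumption about how the printed three-letter codes read, and the helper names are hypothetical. Positions 79, 103, and 161 are those of the sequence-based assignment, and position 125 is the aspartate inside the SSDLYLV segment used in the structure-based alignment of Table 2.

```python
# One-letter transcription of the sequence listing, ten residues per block.
BLOCKS = [
    "RRGREILLGP", "ADGMVSKGWR", "LLAPITAYAQ", "QTRGLLGCII", "TSLTGRDKNQ",
    "VEGEVQIVST", "AAQTFLATCI", "NGVCWTVYHG", "AGTRTIASPK", "GPVIQMYTNV",
    "DQDLVGWPAS", "QGTRSLTPCT", "CGSSDLYLVT", "RHADVIPVRR", "RGDSRGSLLS",
    "PRPISYLKGS", "SGGPLLCPAG", "HAVGIFRAAV", "CTRGVAKAVD", "FIPVENLETT",
    "MR",
]
SEQUENCE = "".join(BLOCKS)


def residue(seq: str, position: int) -> str:
    """Residue at a 1-based position."""
    return seq[position - 1]


def segment(seq: str, start: int, end: int) -> str:
    """Residues from `start` to `end` inclusive (1-based)."""
    return seq[start - 1:end]


for name, pos in (("His", 79), ("Asp", 103), ("Asp", 125), ("Ser", 161)):
    print(f"{name}{pos}: {residue(SEQUENCE, pos)}")

# Segments surrounding the putative catalytic His and Ser.
print(segment(SEQUENCE, 74, 82))
print(segment(SEQUENCE, 157, 165))
```

With this transcription, residues 74-82 and 157-165 reproduce the HCV His and Ser segments shown in Table 1.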

TABLE 1: Alignment of Active Residues by Sequence

Protease           His          Asp          Ser
HCV                CWTVYHGAG    DQDLGWPAP    LKGSSGGPL
Yellow Fever       EHTMWHVTR    KEDLVAYGG    PSGTSGSPI
West Nile Fever    FHTLWHTTK    KEDRLCYGG    PTGTSGSPI
Murray Valley      FHTLWHTTR    KEDRVTYGG    PIGTSGSPI
Kunjin Virus       FHTLWHTTK    KEDRLCYGG    PTGTSGSPI

Alternatively, one can make catalytic residue assignments based on
structural homology. Table 2 shows alignment of HCV against the catalytic sites
of several well-characterized serine proteases based on structural considerations:
protease A from Streptomyces griseus, α-lytic protease, bovine trypsin, chymotrypsin,
and elastase (ML James et al, Can J Biochem (1978) 56:396). Again, a degree of
homology is observed. The HCV residues identified are numbered His79, Asp125, and
Ser161 in the sequence listed above.

WO 91/15596    PCT/US91/02209



TABLE 2: Alignment of Active Residues by Structure

Protease            His      Asp        Ser
S. griseus A        TAGHC    NNDYGH     GDSGGSL
α-Lytic protease    TAGHC    ONDRAWV    GDSGGSW
Bovine Trypsin      SAAHC    NNDIMLI    GDSGGPV
Chymotrypsin        TAAHC    NNDTTLL    GDSGGPL
Elastase            TAAHC    GYDIALL    GDSGGPL
HCV                 TVYHG    SSDLYLV    GSSGGPL

The most direct manner to verify the residues essential to the active site
is to replace each residue individually with a residue of equivalent steric size. This is
easily accomplished by site-specific mutagenesis and similar methods known in the
art. If replacement of a particular residue with a residue of equivalent size results in
loss of activity, the essential nature of the replaced residue is confirmed.

"HCV protease analogs" refer to polypeptides which vary from the full 
length protease sequence by deletion, alteration and/or addition to the amino acid 
sequence of the native protease. HCV protease analogs include the truncated proteases 
described above, as well as HCV protease muteins and fusion proteins comprising 
HCV protease, truncated protease, or protease muteins. Alterations to form HCV
protease muteins are preferably conservative amino acid substitutions, in which an amino
acid is replaced with another naturally-occurring amino acid of similar character. For
example, the following substitutions are considered "conservative":

Gly <-> Ala;            Lys <-> Arg;
Val <-> Ile <-> Leu;    Asn <-> Gln; and
Asp <-> Glu;            Phe <-> Trp <-> Tyr.
Nonconservative changes are generally substitutions of one of the above amino acids
with an amino acid from a different group (e.g., substituting Asn for Glu), or
substituting Cys, Met, His, or Pro for any of the above amino acids. Substitutions
involving common amino acids are conveniently performed by site specific
mutagenesis of an expression vector encoding the desired protein, and subsequent
expression of the altered form. One may also alter amino acids by synthetic or
semi-synthetic methods. For example, one may convert cysteine or serine residues to
selenocysteine by appropriate chemical treatment of the isolated protein. Alternatively,
one may incorporate uncommon amino acids in standard in vitro protein synthetic
methods. Typically, the total number of residues changed, deleted or added to the
native sequence in the muteins will be no more than about 20, preferably no more than
about 10, and most preferably no more than about 5.
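
The substitution rules above can be expressed as a small lookup, sketched below. The six groups (Gly/Ala; Lys/Arg; Val/Ile/Leu; Asn/Gln; Asp/Glu; Phe/Trp/Tyr) and the four always-nonconservative replacements (Cys, Met, His, Pro) are taken from the text; the function name and the "unclassified" label for cases the text does not address (for example, changes involving Ser or Thr) are assumptions made for this example.

```python
CONSERVATIVE_GROUPS = [
    {"Gly", "Ala"},
    {"Lys", "Arg"},
    {"Val", "Ile", "Leu"},
    {"Asn", "Gln"},
    {"Asp", "Glu"},
    {"Phe", "Trp", "Tyr"},
]
# Replacements the text treats as nonconservative for any grouped residue.
NONCONSERVATIVE_REPLACEMENTS = {"Cys", "Met", "His", "Pro"}
GROUPED = set().union(*CONSERVATIVE_GROUPS)


def classify_substitution(original: str, replacement: str) -> str:
    """Classify replacing `original` with `replacement` (three-letter codes)."""
    if original == replacement:
        raise ValueError("not a substitution")
    if original not in GROUPED:
        return "unclassified"  # the text only addresses the grouped residues
    if any(original in g and replacement in g for g in CONSERVATIVE_GROUPS):
        return "conservative"
    if replacement in GROUPED or replacement in NONCONSERVATIVE_REPLACEMENTS:
        return "nonconservative"
    return "unclassified"


print(classify_substitution("Val", "Leu"))
print(classify_substitution("Glu", "Asn"))
print(classify_substitution("Lys", "Pro"))
```

A mutein design could be screened by applying this check to each changed position and counting the total number of changes against the limits given above.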

The term "fusion protein" generally refers to a polypeptide comprising an
amino acid sequence drawn from two or more individual proteins. In the present
invention, "fusion protein" is used to denote a polypeptide comprising the HCV
protease, truncate, mutein or a functional portion thereof, fused to a non-HCV protein
or polypeptide ("fusion partner"). Fusion proteins are most conveniently produced by
expression of a fused gene, which encodes a portion of one polypeptide at the 5' end
and a portion of a different polypeptide at the 3' end, where the different portions are
joined in one reading frame which may be expressed in a suitable host. It is presently
preferred (although not required) to position the HCV protease or analog at the
carboxy terminus of the fusion protein, and to employ a functional enzyme fragment at
the amino terminus. As the HCV protease is normally expressed within a large
polyprotein, it is not expected to include cell transport signals (e.g., export or secretion
signals). Suitable functional enzyme fragments are those polypeptides which exhibit a
quantifiable activity when expressed fused to the HCV protease. Exemplary enzymes
include, without limitation, β-galactosidase (β-gal), β-lactamase, horseradish
peroxidase (HRP), glucose oxidase (GO), human superoxide dismutase (hSOD), urease,
and the like. These enzymes are convenient because the amount of fusion protein
produced can be quantified by means of simple colorimetric assays. Alternatively, one
may employ antigenic proteins or fragments, to permit simple detection and
quantification of fusion proteins using antibodies specific for the fusion partner. The
presently preferred fusion partner is hSOD.

B. General Method 

The practice of the present invention generally employs conventional
techniques of molecular biology, microbiology, recombinant DNA, and immunology,
which are within the skill of the art. Such techniques are explained fully in the
literature. See for example J. Sambrook et al, "Molecular Cloning: A Laboratory
Manual" (1989); "DNA Cloning", Vol. I and II (D.N. Glover ed. 1985);
"Oligonucleotide Synthesis" (M.J. Gait ed, 1984); "Nucleic Acid Hybridization" (B.D.
Hames & S.J. Higgins eds. 1984); "Transcription And Translation" (B.D. Hames &
S.J. Higgins eds. 1984); "Animal Cell Culture" (R.I. Freshney ed. 1986);
"Immobilized Cells And Enzymes" (IRL Press, 1986); B. Perbal, "A Practical Guide To
Molecular Cloning" (1984); the series, "Methods In Enzymology" (Academic Press,
Inc.); "Gene Transfer Vectors For Mammalian Cells" (J.H. Miller and M.P. Calos eds.
1987, Cold Spring Harbor Laboratory); Meth Enzymol (1987) 154 and 155 (Wu and
Grossman, and Wu, eds., respectively); Mayer & Walker, eds. (1987),
"Immunochemical Methods In Cell And Molecular Biology" (Academic Press, London); Scopes,
"Protein Purification: Principles And Practice", 2nd Ed (Springer-Verlag, N.Y., 1987);
and "Handbook Of Experimental Immunology", volumes I-IV (Weir and Blackwell,
eds, 1986).

Both prokaryotic and eukaryotic host cells are useful for expressing
desired coding sequences when appropriate control sequences compatible with the
designated host are used. Among prokaryotic hosts, E. coli is most frequently used.
Expression control sequences for prokaryotes include promoters, optionally containing
operator portions, and ribosome binding sites. Transfer vectors compatible with
prokaryotic hosts are commonly derived from, for example, pBR322, a plasmid
containing operons conferring ampicillin and tetracycline resistance, and the various
pUC vectors, which also contain sequences conferring antibiotic resistance markers.
These plasmids are commercially available. The markers may be used to obtain
successful transformants by selection. Commonly used prokaryotic control sequences
include the β-lactamase (penicillinase) and lactose promoter systems (Chang et al,
Nature (1977) 198:1056), the tryptophan (trp) promoter system (Goeddel et al, Nuc
Acids Res (1980) 8:4057) and the lambda-derived P_L promoter and N gene ribosome
binding site (Shimatake et al, Nature (1981) 292:128) and the hybrid tac promoter (De
Boer et al, Proc Nat Acad Sci USA (1983) 292:128) derived from sequences of the trp
and lac UV5 promoters. The foregoing systems are particularly compatible with E.
coli; if desired, other prokaryotic hosts such as strains of Bacillus or Pseudomonas
may be used, with corresponding control sequences.

Eukaryotic hosts include without limitation yeast and mammalian cells
in culture systems. Yeast expression hosts include Saccharomyces, Kluyveromyces, Pichia,
and the like. Saccharomyces cerevisiae and Saccharomyces carlsbergensis and K.
lactis are the most commonly used yeast hosts, and are convenient fungal hosts.
Yeast-compatible vectors carry markers which permit selection of successful
transformants by conferring prototrophy to auxotrophic mutants or resistance to heavy metals
on wild-type strains. Yeast compatible vectors may employ the 2μ origin of
replication (Broach et al, Meth Enzymol (1983) 101:307), the combination of CEN3
and ARS1 or other means for assuring replication, such as sequences which will result
in incorporation of an appropriate fragment into the host cell genome. Control
sequences for yeast vectors are known in the art and include promoters for the
synthesis of glycolytic enzymes (Hess et al, J Adv Enzyme Reg (1968) 7:149;
Holland et al, Biochem (1978), 17:4900), including the promoter for
3-phosphoglycerate kinase (R. Hitzeman et al, J Biol Chem (1980) 255:2073). Terminators
may also be included, such as those derived from the enolase gene (Holland, J Biol
Chem (1981) 256:1385). Particularly useful control systems are those which comprise
the glyceraldehyde-3-phosphate dehydrogenase (GAPDH) promoter or alcohol
dehydrogenase (ADH) regulatable promoter, terminators also derived from GAPDH,
and if secretion is desired, a leader sequence derived from yeast α-factor (see U.S. Pat.
No. 4,870,008, incorporated herein by reference).

A presently preferred expression system employs the ubiquitin leader as
the fusion partner. Copending application USSN 7/390,599 filed 7 August 1989
disclosed vectors for high expression of yeast ubiquitin fusion proteins. Yeast
ubiquitin provides a 76 amino acid polypeptide which is automatically cleaved from
the fused protein upon expression. The ubiquitin amino acid sequence is as follows:

Gln Ile Phe Val Lys Thr Leu Thr Gly Lys Thr Ile Thr
Leu Glu Val Glu Ser Ser Asp Thr Ile Asp Asn Val Lys
Ser Lys Ile Gln Asp Lys Glu Gly Ile Pro Pro Asp Gln
Gln Arg Leu Ile Phe Ala Gly Lys Gln Leu Glu Asp Gly
Arg Thr Leu Ser Asp Tyr Asn Ile Gln Lys Glu Ser Thr
Leu His Leu Val Leu Arg Leu Arg Gly Gly

See also Ozkaynak et al, Nature (1984) 312:663-66. Polynucleotides
encoding the ubiquitin polypeptide may be synthesized by standard methods, for
example following the technique of Barr et al, J Biol Chem (1988) 268:1671-78 using
an Applied Biosystems 380A DNA synthesizer. Using appropriate linkers, the
ubiquitin gene may be inserted into a suitable vector and ligated to a sequence
encoding the HCV protease or a fragment thereof.

In addition, the transcriptional regulatory region and the transcriptional
initiation region which are operably linked may be such that they are not naturally
associated in the wild-type organism. These systems are described in detail in EPO
120,551, published October 3, 1984; EPO 116,201, published August 22, 1984; and
EPO 164,556, published December 18, 1985, all of which are commonly owned with
the present invention, and are hereby incorporated herein by reference in full.

Mammalian cell lines available as hosts for expression are known in the
art and include many immortalized cell lines available from the American Type
Culture Collection (ATCC), including HeLa cells, Chinese hamster ovary (CHO) cells,
baby hamster kidney (BHK) cells, and a number of other cell lines. Suitable
promoters for mammalian cells are also known in the art and include viral promoters
such as that from Simian Virus 40 (SV40) (Fiers et al, Nature (1978) 273:113), Rous
sarcoma virus (RSV), adenovirus (ADV), and bovine papilloma virus (BPV).
Mammalian cells may also require terminator sequences and poly-A addition
sequences. Enhancer sequences which increase expression may also be included, and
sequences which promote amplification of the gene may also be desirable (for example
methotrexate resistance genes). These sequences are known in the art.

Vectors suitable for replication in mammalian cells are known in the art,
and may include viral replicons, or sequences which ensure integration of the
appropriate sequences encoding HCV epitopes into the host genome. For example,
another vector used to express foreign DNA is Vaccinia virus. In this case the
heterologous DNA is inserted into the Vaccinia genome. Techniques for the insertion
of foreign DNA into the vaccinia virus genome are known in the art, and may utilize,
for example, homologous recombination. The heterologous DNA is generally inserted
into a gene which is non-essential to the virus, for example, the thymidine kinase gene
(tk), which also provides a selectable marker. Plasmid vectors that greatly facilitate
the construction of recombinant viruses have been described (see, for example,
Mackett et al, J Virol (1984) 49:857; Chakrabarti et al, Mol Cell Biol (1985) 5:3403;
Moss, in GENE TRANSFER VECTORS FOR MAMMALIAN CELLS (Miller and
Calos, eds., Cold Spring Harbor Laboratory, NY, 1987), p. 10). Expression of the
HCV polypeptide then occurs in cells or animals which are infected with the live
recombinant vaccinia virus.

In order to detect whether or not the HCV polypeptide is expressed
from the vaccinia vector, BSC 1 cells may be infected with the recombinant vector
and grown on microscope slides under conditions which allow expression. The cells
may then be acetone-fixed, and immunofluorescence assays performed using serum
which is known to contain anti-HCV antibodies to a polypeptide(s) encoded in the
region of the HCV genome from which the HCV segment in the recombinant
expression vector was derived.

Other systems for expression of eukaryotic or viral genomes include
insect cells and vectors suitable for use in these cells. These systems are known in the
art, and include, for example, insect expression transfer vectors derived from the
baculovirus Autographa californica nuclear polyhedrosis virus (AcNPV), which is a
helper-independent, viral expression vector. Expression vectors derived from this
system usually use the strong viral polyhedrin gene promoter to drive expression of
heterologous genes. Currently the most commonly used transfer vector for introducing
foreign genes into AcNPV is pAc373 (see PCT WO89/046699 and USSN 7/456,637).
Many other vectors known to those of skill in the art have also been designed for
improved expression. These include, for example, pVL985 (which alters the
polyhedrin start codon from ATG to ATT, and introduces a BamHI cloning site 32 bp
downstream from the ATT; see Luckow and Summers, Virol (1989) 17:31). AcNPV
transfer vectors for high level expression of nonfused foreign proteins are described in
copending applications PCT WO89/046699 and USSN 7/456,637. A unique BamHI
site is located following position -8 with respect to the translation initiation codon
ATG of the polyhedrin gene. There are no cleavage sites for SmaI, PstI, BglII, XbaI
or SstI. Good expression of nonfused foreign proteins usually requires foreign genes
that ideally have a short leader sequence containing suitable translation initiation
signals preceding an ATG start signal. The plasmid also contains the polyhedrin
polyadenylation signal and the ampicillin-resistance (amp) gene and origin of replication
for selection and propagation in E. coli.

Methods for the introduction of heterologous DNA into the desired site
in the baculovirus are known in the art. (See Summers and Smith, Texas
Agricultural Experiment Station Bulletin No. 1555; Smith et al, Mol Cell Biol (1983)
3:2156-2165; and Luckow and Summers, Virol (1989) 17:31). For example, the
heterologous DNA can be inserted into a gene such as the polyhedrin gene by
homologous recombination, or into a restriction enzyme site engineered into the
desired baculovirus gene. The inserted sequences may be those which encode all or
varying segments of the polyprotein, or other ORFs which encode viral polypeptides.
For example, the insert could encode the following amino acid segments
from the polyprotein: amino acids 1-1078; amino acids 332-662; amino acids 406-662;
amino acids 156-328, and amino acids 199-328.

The signals for post-translational modifications, such as signal peptide 
cleavage, proteolytic cleavage, and phosphorylation, appear to be recognized by insect 
cells. The signals required for secretion and nuclear accumulation also appear to be 
conserved between the invertebrate cells and vertebrate cells. Examples of the signal 
sequences from vertebrate cells which are effective in invertebrate cells are known in 
the art, for example, the human interleukin-2 signal (IL2S) which signals for secretion 
from the cell, is recognized and properly removed in insect cells. 

Transformation may be by any known method for introducing 
polynucleotides into a host cell, including, for example packaging the polynucleotide 
in a virus and transducing a host cell with the virus, and by direct uptake of the 
polynucleotide. The transformation procedure used depends upon the host to be 
transformed. Bacterial transformation by direct uptake generally employs treatment 
with calcium or rubidium chloride (Cohen, Proc Nat Acad Sci USA (1972) 69:2110; 
T. Maniatis et al, "Molecular Cloning: A Laboratory Manual" (Cold Spring Harbor Press, 
Cold Spring Harbor, NY, 1982)). Yeast transformation by direct uptake may be carried out using 
the method of Hinnen et al, Proc Nat Acad Sci USA (1978) 75:1929. Mammalian 
transformations by direct uptake may be conducted using the calcium phosphate 
precipitation method of Graham and Van der Eb, Virol (1978) 52:546, or the various 
known modifications thereof. Other methods for introducing recombinant 
polynucleotides into cells, particularly into mammalian cells, include dextran-mediated 
transfection, calcium phosphate mediated transfection, polybrene mediated transfection, 
protoplast fusion, electroporation, encapsulation of the polynucleotide(s) in liposomes, 
and direct microinjection of the polynucleotides into nuclei. 

Vector construction employs techniques which are known in the art. 
Site-specific DNA cleavage is performed by treating with suitable restriction enzymes 
under conditions which generally are specified by the manufacturer of these 
commercially available enzymes. In general, about 1 µg of plasmid or DNA sequence 
is cleaved by 1 unit of enzyme in about 20 µL buffer solution by incubation for 1-2 hr 
at 37°C. After incubation with the restriction enzyme, protein is removed by 
phenol/chloroform extraction and the DNA recovered by precipitation with ethanol. 
The cleaved fragments may be separated using polyacrylamide or agarose gel 
electrophoresis techniques, according to the general procedures described in Meth 
Enzymol (1980) 65:499-560. 

Sticky-ended cleavage fragments may be blunt ended using E. coli DNA 
polymerase I (Klenow fragment) with the appropriate deoxynucleotide triphosphates 
(dNTPs) present in the mixture. Treatment with S1 nuclease may also be used, 
resulting in the hydrolysis of any single stranded DNA portions. 

Ligations are carried out under standard buffer and temperature 
conditions using T4 DNA ligase and ATP; sticky end ligations require less ATP and 
less ligase than blunt end ligations. When vector fragments are used as part of a 
ligation mixture, the vector fragment is often treated with bacterial alkaline 
phosphatase (BAP) or calf intestinal alkaline phosphatase to remove the 5'-phosphate, 
thus preventing religation of the vector. Alternatively, restriction enzyme digestion of 
unwanted fragments can be used to prevent ligation. 

Ligation mixtures are transformed into suitable cloning hosts, such as 
E. coli, and successful transformants selected using the markers incorporated (e.g., 
antibiotic resistance), and screened for the correct construction. 

Synthetic oligonucleotides may be prepared using an automated 
oligonucleotide synthesizer as described by Warner, DNA (1984) 3:401. If desired, 
the synthetic strands may be labeled with 32P by treatment with polynucleotide kinase 
in the presence of 32P-ATP under standard reaction conditions. 

DNA sequences, including those isolated from cDNA libraries, may be 
modified by known techniques, for example by site directed mutagenesis (see e.g., 
Zoller, Nuc Acids Res (1982) 10:6487). Briefly, the DNA to be modified is packaged 
into phage as a single stranded sequence, and converted to a double stranded DNA 
with DNA polymerase, using as a primer a synthetic oligonucleotide complementary to 
the portion of the DNA to be modified, where the desired modification is included in 
the primer sequence. The resulting double stranded DNA is transformed into a 
phage-supporting host bacterium. Cultures of the transformed bacteria which contain copies 
of each strand of the phage are plated in agar to obtain plaques. Theoretically, 50% of 
the new plaques contain phage having the mutated sequence, and the remaining 50% 
have the original sequence. Replicates of the plaques are hybridized to labeled 
synthetic probe at temperatures and conditions which permit hybridization with the 
correct strand, but not with the unmodified sequence. The sequences which have been 
identified by hybridization are recovered and cloned. 

DNA libraries may be probed using the procedure of Grunstein and 
Hogness, Proc Nat Acad Sci USA (1975) 73:3961. Briefly, in this procedure the DNA 
to be probed is immobilized on nitrocellulose filters, denatured, and prehybridized with 
a buffer containing 0-50% formamide, 0.75 M NaCl, 75 mM Na citrate, 0.02% (wt/v) 
each of bovine serum albumin, polyvinylpyrrolidone, and Ficoll®, 50 mM NaH2PO4 
(pH 6.5), 0.1% SDS, and 100 µg/mL carrier denatured DNA. The percentage of 
formamide in the buffer, as well as the time and temperature conditions of the 
prehybridization and subsequent hybridization steps, depend on the stringency required. 
Oligomeric probes which require lower stringency conditions are generally used with 
low percentages of formamide, lower temperatures, and longer hybridization times. 
Probes containing more than 30 or 40 nucleotides, such as those derived from cDNA 
or genomic sequences, generally employ higher temperatures, e.g., about 40-42°C, and 
a high percentage formamide, e.g., 50%. Following prehybridization, 5'-32P-labeled 
oligonucleotide probe is added to the buffer, and the filters are incubated in this 
mixture under hybridization conditions. After washing, the treated filters are subjected 
to autoradiography to show the location of the hybridized probe; DNA in 
corresponding locations on the original agar plates is used as the source of the desired 
DNA. 

For routine vector constructions, ligation mixtures are transformed into 
E. coli strain HB101 or other suitable hosts, and successful transformants selected by 
antibiotic resistance or other markers. Plasmids from the transformants are then 
prepared according to the method of Clewell et al, Proc Nat Acad Sci USA (1969) 
62:1159, usually following chloramphenicol amplification (Clewell, J Bacteriol (1972) 
110:667). The DNA is isolated and analyzed, usually by restriction enzyme analysis 
and/or sequencing. Sequencing may be performed by the dideoxy method of Sanger et 
al, Proc Nat Acad Sci USA (1977) 74:5463, as further described by Messing et al, 
Nuc Acids Res (1981) 9:309, or by the method of Maxam et al, Meth Enzymol (1980) 
65:499. Problems with band compression, which are sometimes observed in GC-rich 
regions, were overcome by use of 7-deazoguanosine according to Barr et al, 
Biotechniques (1986) 4:428. 

The enzyme-linked immunosorbent assay (ELISA) can be used to 
measure either antigen or antibody concentrations. This method depends upon 
conjugation of an enzyme to either an antigen or an antibody, and uses the bound 
enzyme activity as a quantitative label. To measure antibody, the known antigen is 
fixed to a solid phase (e.g., a microtiter dish, plastic cup, dipstick, plastic bead, or the 
like), incubated with test serum dilutions, washed, incubated with anti-immunoglobulin 
labeled with an enzyme, and washed again. Enzymes suitable for labeling are known 
in the art, and include, for example, horseradish peroxidase (HRP). Enzyme activity 
bound to the solid phase is usually measured by adding a specific substrate, and 
determining product formation or substrate utilization colorimetrically. The enzyme 
activity bound is a direct function of the amount of antibody bound. 

To measure antigen, a known specific antibody is fixed to the solid 
phase, the test material containing antigen is added, the solid phase is washed 
after an incubation, and a second enzyme-labeled antibody is added. After washing, 
substrate is added, and enzyme activity is measured colorimetrically and related to 
antigen concentration. 

Proteases of the invention may be assayed for activity by cleaving a 
substrate which provides detectable cleavage products. As the HCV protease normally 
cleaves itself from the genomic polyprotein, one can employ this autocatalytic activity 
both to assay expression of the protein and determine activity. For example, if the 
protease is joined to its fusion partner so that the HCV protease N-terminal cleavage 
signal (Arg-Arg) is included, the expression product will cleave itself into fusion 
partner and active HCV protease. One may then assay the products, for example by 
western blot, to verify that the proteins produced correspond in size to the separate 
fusion partner and protease proteins. It is presently preferred to employ small peptide 
p-nitrophenyl esters or methylcoumarins, as cleavage may then be followed by 
spectrophotometric or fluorescent assays. Following the method described by E.D. 
Matayoshi et al, Science (1990) 247:231-35, one may attach a fluorescent label to one 
end of the substrate and a quenching molecule to the other end: cleavage is then 
determined by measuring the resulting increase in fluorescence. If a suitable enzyme 
or antigen has been employed as the fusion partner, the quantity of protein produced 
may easily be determined. Alternatively, one may exclude the HCV protease 
N-terminal cleavage signal (preventing self-cleavage) and add a separate cleavage 
substrate, such as a fragment of the HCV NS3 domain including the native processing 
signal or a synthetic analog. 

In the absence of this protease activity, the HCV polyprotein should 
remain in its unprocessed form, and thus render the virus noninfectious. Thus, the 
protease is useful for assaying pharmaceutical agents for control of HCV, as 
compounds which inhibit the protease activity sufficiently will also inhibit viral infectivity. 
Such inhibitors may take the form of organic compounds, particularly compounds 
which mimic the cleavage site of HCV recognized by the protease. Three of the 
putative cleavage sites of the HCV polyprotein have the following amino acid 
sequences: 

Val-Ser-Ala-Arg-Arg // Gly-Arg-Glu-Ile-Leu-Leu-Gly 
Ala-Ile-Leu-Arg-Arg // His-Val-Gly-Pro- 
Val-Ser-Cys-Gln-Arg // Gly-Tyr- 

These sites are characterized by the presence of two basic amino acids 
immediately before the cleavage site, and are similar to the cleavage sites recognized 
by other flavivirus proteases. Thus, suitable protease inhibitors may be prepared 
which mimic the basic/basic/small neutral motif of the HCV cleavage sites, but 
substituting a nonlabile linkage for the peptide bond cleaved in the natural substrate. 
Suitable inhibitors include peptide trifluoromethyl ketones, peptide boronic acids, 
peptide α-ketoesters, peptide difluoroketo compounds, peptide aldehydes, peptide 
diketones, and the like. For example, the peptide aldehyde N-acetyl-phenylalanyl- 
glycinaldehyde is a potent inhibitor of the protease papain. One may conveniently 
prepare and assay large mixtures of peptides using the methods disclosed in U.S. 
Patent application Serial No. 7/189,318, filed 2 May 1988 (published as PCT 
WO89/10931), incorporated herein by reference. This application teaches methods for 
generating mixtures of peptides up to hexapeptides having all possible amino acid 
sequences, and further teaches assay methods for identifying those peptides capable of 
binding to proteases. 
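
By way of illustration only, the basic/basic/small neutral motif described above may be 
searched for in a polypeptide sequence with a short script. The following Python sketch 
is not part of the disclosed methods. The residue classes (Arg and Lys as basic; Gly, Ala 
and Ser as small neutral), the function name and the one-letter test sequences are 
assumptions chosen for the example, and a match indicates only a candidate site. 

```python
import re

# Assumed residue classes (one-letter codes); not defined in the specification.
BASIC = "RK"            # Arg, Lys
SMALL_NEUTRAL = "GAS"   # Gly, Ala, Ser

# Zero-width lookahead so that overlapping candidate sites are all reported.
_SITE = re.compile(rf"(?=[{BASIC}]{{2}}[{SMALL_NEUTRAL}])")


def candidate_cleavage_sites(seq):
    """Return the 0-based index of the residue just C-terminal to each
    candidate scissile bond (basic-basic // small neutral)."""
    return [m.start() + 2 for m in _SITE.finditer(seq.upper())]


# First putative site listed above, written in one-letter code:
# Val-Ser-Ala-Arg-Arg // Gly-Arg-Glu-Ile-Leu-Leu-Gly
print(candidate_cleavage_sites("VSARRGREILLG"))
```

Under these assumed classes the second listed site (Arg-Arg // His) is not reported, 
since His is not treated as small neutral; the classes may be widened as desired. 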

Other protease inhibitors may be proteins, particularly antibodies and 
antibody derivatives. Recombinant expression systems may be used to generate 
quantities of protease sufficient for production of monoclonal antibodies (MAbs) 
specific for the protease. Suitable antibodies for protease inhibition will bind to the 
protease in a manner reducing or eliminating the enzymatic activity, typically by 
obscuring the active site. Suitable MAbs may be used to generate derivatives, such as 
Fab fragments, chimeric antibodies, altered antibodies, univalent antibodies, and single 
domain antibodies, using methods known in the art. 

Protease inhibitors are screened using methods of the invention. In 
general, a substrate is employed which mimics the enzyme's natural substrate, but 
which provides a quantifiable signal when cleaved. The signal is preferably detectable 
by colorimetric or fluorometric means: however, other methods such as HPLC or 
silica gel chromatography, GC-MS, nuclear magnetic resonance, and the like may also 
be useful. After optimum substrate and enzyme concentrations are determined, a 
candidate protease inhibitor is added to the reaction mixture at a range of 
concentrations. The assay conditions ideally should resemble the conditions under 
which the protease is to be inhibited in vivo, i.e., under physiologic pH, temperature, 
ionic strength, etc. Suitable inhibitors will exhibit strong protease inhibition at 
concentrations which do not raise toxic side effects in the subject. Inhibitors which 
compete for binding to the protease active site may require concentrations equal to or 
greater than the substrate concentration, while inhibitors capable of binding 
irreversibly to the protease active site may be added in concentrations on the order of the 
enzyme concentration. 
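
The dependence of competitive inhibition on substrate concentration may be illustrated 
with the standard Michaelis-Menten treatment of a reversible competitive inhibitor. The 
following Python sketch is offered only as an aid to planning the concentration range of 
a screen. The model itself is textbook enzyme kinetics rather than part of this 
disclosure, and the function names and the numerical values of Km, Ki and substrate 
concentration are hypothetical. 

```python
def fractional_activity(s, i, km, ki):
    """Remaining activity v_i / v_0 for a reversible competitive inhibitor.

    s, i   : substrate and inhibitor concentrations (same units as km, ki)
    km, ki : Michaelis constant and inhibition constant
    Derived from v = Vmax*s / (km*(1 + i/ki) + s) divided by v at i = 0.
    """
    return (km + s) / (km * (1.0 + i / ki) + s)


def ic50_competitive(s, km, ki):
    """Inhibitor concentration giving 50% activity: ki * (1 + s/km)."""
    return ki * (1.0 + s / km)


# Hypothetical values: the more substrate present, the more competitive
# inhibitor is needed to reach half inhibition.
for substrate in (10.0, 100.0, 1000.0):
    print(substrate, ic50_competitive(substrate, km=10.0, ki=1.0))
```

The sketch does not apply to irreversible inhibitors, for which the relevant scale is 
the enzyme concentration, as noted above. 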

In a presently preferred embodiment, an inactive protease mutein is 
employed rather than an active enzyme. It has been found that replacing a critical 
residue within the active site of a protease (e.g., replacing the active site Ser of a 
serine protease) does not significantly alter the structure of the enzyme, and thus 
preserves the binding specificity. The altered enzyme still recognizes and binds to its 
proper substrate, but fails to effect cleavage. Thus, in one method of the invention an 
inactivated HCV protease is immobilized, and a mixture of candidate inhibitors added. 
Inhibitors that closely mimic the enzyme's preferred recognition sequence will 
compete more successfully for binding than other candidate inhibitors. The poorly- 
binding candidates may then be separated, and the identity of the strongly-binding 
inhibitors determined. For example, HCV protease may be prepared substituting Ala 
for Ser 121 (Fig. 1), providing an enzyme capable of binding the HCV protease 
substrate, but incapable of cleaving it. The resulting protease mutein is then bound to a 
solid support, for example Sephadex® beads, and packed into a column. A mixture of 
candidate protease inhibitors in solution is then passed through the column and 
fractions collected. The last fractions to elute will contain the strongest-binding 
compounds, and provide the preferred protease inhibitor candidates. 

Protease inhibitors may be administered by a variety of methods, such 
as intravenously, orally, intramuscularly, intraperitoneally, bronchially, intranasally, 
and so forth. The preferred route of administration will depend upon the nature of the 
inhibitor. Inhibitors prepared as organic compounds may often be administered orally 
(which is generally preferred) if well absorbed. Protein-based inhibitors (such as most 
antibody derivatives) must generally be administered by parenteral routes. 



C. Examples 

The examples presented below are provided as a further guide to the 
practitioner of ordinary skill in the art, and are not to be construed as limiting the 
invention in any way. 

Example 1 

(Preparation of HCV cDNA) 
A genomic library of HCV cDNA was prepared as described in PCT 
WO89/046699 and USSN 7/456,637. This library, ATCC accession no. 40394, has 
been deposited as set forth below. 

Example 2 

(Expression of the Polypeptide Encoded in Clone 5-1-1.) 
(A) The HCV polypeptide encoded within clone 5-1-1 (see 
Example 1) was expressed as a fusion polypeptide with human superoxide dismutase 
(SOD). This was accomplished by subcloning the clone 5-1-1 cDNA insert into the 
expression vector pSODCF1 (K.S. Steimer et al, J Virol (1986) 58:9; EPO 138,111) 
as follows. The SOD/5-1-1 expression vector was transformed into E. coli D1210 
cells. These cells, named Cf1/5-1-1 in E. coli, were deposited as set forth below and 
have an ATCC accession no. of 67967. 

First, DNA isolated from pSODCF1 was treated with BamHI and 
EcoRI, and the following linker was ligated into the linear DNA created by the 
restriction enzymes: 

GAT CCT GGA ATT CTG ATA A 
     GA CCT TAA GAC TAT TTT AA 

After cloning, the plasmid containing the insert was isolated. 

Plasmid containing the insert was restricted with EcoRI. The HCV 
cDNA insert in clone 5-1-1 was excised with EcoRI, and ligated into this EcoRI 
linearized plasmid DNA. The DNA mixture was used to transform E. coli strain D1210 
(Sadler et al, Gene (1980) 8:279). Recombinants with the 5-1-1 cDNA in the correct 
orientation for expressing the ORF shown in Figure 1 were identified by restriction 
mapping and nucleotide sequencing. 

Recombinant bacteria from one clone were induced to express the 
SOD-HCV 5-1-1 polypeptide by growing the bacteria in the presence of IPTG. 

Three separate expression vectors, pcf1AB, pcf1CD, and pcf1EF were 
created by ligating three new linkers, AB, CD, and EF to a BamHI-EcoRI fragment 
derived by digesting to completion the vector pSODCF1 with EcoRI and BamHI, 
followed by treatment with alkaline phosphatase. The linkers were created from six 
oligomers, A, B, C, D, E, and F. Each oligomer was phosphorylated by treatment 
with kinase in the presence of ATP prior to annealing to its complementary oligomer. 
The sequences of the synthetic linkers were the following: 

Name    DNA Sequence (5' to 3') 

A       GATC CTG AAT TCC TGA TAA 
B            GAC TTA AGG ACT ATTTTAA 

C       GATC CGA ATT CTG TGA TAA 
D            GCT TAA GAC ACT ATTTTAA 

E       GATC CTG GAA TTC TGA TAA 
F            GAC CTT AAG ACT ATTTTAA 

Each of the three linkers destroys the original EcoRI site, and creates a 
new EcoRI site within the linker, but within a different reading frame. Thus, the HCV 
cDNA EcoRI fragments isolated from the clones, when inserted into the expression 
vector, were in three different reading frames. 
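
The reading-frame relationship among the three linkers may be checked with a short 
script. The following Python sketch is illustrative only. It assumes that oligomers B, D 
and F are written as the complements of A, C and E (i.e., read 3' to 5' and aligned after 
the four-nucleotide BamHI overhang), and that the sequences are as tabulated above with 
apparent transcription errors in A and B corrected; both points are inferences from the 
complementarity of the pairs rather than statements of the specification. 

```python
ECORI = "GAATTC"
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Top strand 5'->3'; bottom strand as tabulated (assumed to read 3'->5').
LINKERS = {
    "AB": ("GATCCTGAATTCCTGATAA", "GACTTAAGGACTATTTTAA"),
    "CD": ("GATCCGAATTCTGTGATAA", "GCTTAAGACACTATTTTAA"),
    "EF": ("GATCCTGGAATTCTGATAA", "GACCTTAAGACTATTTTAA"),
}


def check_linker(top, bottom, overhang=4):
    """Return (strands_pair, ecori_offset) for one linker.

    strands_pair : True if the bottom strand complements the top strand
                   beyond the 5' overhang of the top strand.
    ecori_offset : 0-based position of the new EcoRI site in the top strand.
    """
    paired_top = top[overhang:]
    strands_pair = paired_top.translate(COMPLEMENT) == bottom[:len(paired_top)]
    return strands_pair, top.find(ECORI)


for name, (top, bottom) in LINKERS.items():
    ok, offset = check_linker(top, bottom)
    # offset % 3 is the frame of the EcoRI site relative to the linker start.
    print(name, ok, offset, offset % 3)
```

The three offsets differ modulo 3, which is the property relied upon above for 
expressing each EcoRI fragment in all three reading frames. 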

The HCV cDNA fragments in the designated λgt11 clones were excised 
by digestion with EcoRI; each fragment was inserted into pcf1AB, pcf1CD, and 
pcf1EF. These expression constructs were then transformed into D1210 E. coli cells, 
the transformants cloned, and polypeptides expressed as described in part B below. 
(B) Expression products of the indicated HCV cDNAs were 
tested for antigenicity by direct immunological screening of the colonies, using a 
modification of the method described in Helfman et al, Proc Nat Acad Sci USA 
(1983), 80:31. Briefly, the bacteria were plated onto nitrocellulose filters overlaid on 
ampicillin plates to give approximately 40 colonies per filter. Colonies were replica 
plated onto nitrocellulose filters, and the replicas were regrown overnight in the 
presence of 2 mM IPTG and ampicillin. The bacterial colonies were lysed by suspending 
the nitrocellulose filters for about 15 to 20 min in an atmosphere saturated with 
CHCl3 vapor. Each filter then was placed in an individual 100 mm Petri dish 
containing 10 mL of 50 mM Tris HCl, pH 7.5, 150 mM NaCl, 5 mM MgCl2, 3% (w/v) BSA, 
40 µg/mL lysozyme, and 0.1 µg/mL DNase. The plates were agitated gently for at 
least 8 hours at room temperature. The filters were rinsed in TBST (50 mM Tris HCl, 
pH 8.0, 150 mM NaCl, 0.005% Tween® 20). After incubation, the cell residues were 
rinsed and incubated for one hour in TBS (TBST without Tween®) containing 10% 
sheep serum. The filters were then incubated with pretreated sera in TBS from 
individuals with NANBH, which included 3 chimpanzees; 8 patients with chronic 
NANBH whose sera were positive with respect to antibodies to HCV C100-3 
polypeptide (also called C100); 8 patients with chronic NANBH whose sera were 
negative for anti-C100 antibodies; a convalescent patient whose serum was negative 
for anti-C100 antibodies; and 6 patients with community-acquired NANBH, including 
one whose serum was strongly positive with respect to anti-C100 antibodies, and one 
whose serum was marginally positive with respect to anti-C100 antibodies. The sera, 
diluted in TBS, were pretreated by preabsorption with hSOD for at least 30 minutes at 
37°C. After incubation, the filters were washed twice for 30 min with TBST. The 
expressed proteins which bound antibodies in the sera were labeled by incubation for 2 
hours with 125I-labeled sheep anti-human antibody. The filters were then 
washed twice for 30 min with TBST, dried, and autoradiographed. 

Example 3 

(Cloning of Full-Length SOD-Protease Fusion Proteins) 

(A) pBR322-C200: 

The nucleotide sequences of the HCV cDNAs used below were 
determined essentially as described above, except that the cDNA excised from these phages 
were substituted for the cDNA isolated from clone 5-1-1. 

Clone C33c was isolated using a hybridization probe having the 
following sequence: 

5' ATC AGG ACC GGG GTG AGA ACA ATT ACC ACT 3' 
The sequence of the HCV cDNA in clone C33c is shown in Figure 8, which also 
shows the amino acids encoded therein. 

Clone 35 was isolated by screening with a synthetic polynucleotide 
having the sequence: 

5' AAG CCA CCG TGT GCG CTA GGG CTC AAG CCC 3' 
Approximately 1 in 50,000 clones hybridized with the probe. The polynucleotide and 
deduced amino acid sequences for C35 are shown in Figure 7. 

Clone C31 is shown in Figure 6, which also shows the amino acids 
encoded therein. A C200 cassette was constructed by ligating together a 718 bp 
fragment obtained by digestion of clone C33c DNA with EcoRI and HinfI, a 179 bp 
fragment obtained by digestion of clone C31 DNA with HinfI and BglI, and a 377 bp 
fragment obtained by digesting clone C35 DNA with BglI and EcoRI. The construct 
of ligated fragments was inserted into the EcoRI site of pBR322, yielding the plasmid 
pBR322-C200. 

(B) C7f+C20c: 
Clone 7f was isolated using a probe having the sequence: 

5'-AGC AGA CAA GGG GCC TCC TAG GGT GCA TAA T-3' 
The sequence of HCV cDNA in clone 7f and the amino acids encoded therein are 
shown in Figure 5. 

Clone C20c is isolated using a probe having the following sequence: 
5'-TGC ATC AAT GGG GTG TGC TGG-3' 

The sequence of HCV cDNA in clone C20c, and the amino acids 
encoded therein are shown in Figure 2. 

Clones 7f and C20c were digested with EcoRI and SfaNI to form 400 
bp and 260 bp fragments, respectively. The fragments were then cloned into the 
EcoRI site of pBR322 to form the vector C7f+C20c, and transformed into HB101 cells. 

(C) C300: 

Clone 8h was isolated using a probe based on the sequence of 
nucleotides in clone 33c. The nucleotide sequence of the probe was 

5'-AGA GAC AAC CAT GAG GTC CCC GGT GTT C-3'. 
The sequence of the HCV cDNA in clone 8h, and the amino acids encoded therein, 
are shown in Figure 4. 
Clone C26d is isolated using a probe having the following sequence: 

5'-CTG TTG TGC CCC GCG GCA GCC-3' 
The sequence and amino acid translation of clone C26d is shown in Figure 3. 

Clones C26d and C33c (see part A above) were transformed into the 
methylation minus E. coli strain GM48. Clone C26d was digested with EcoRII and 
DdeI to provide a 100 bp fragment. Clone C33c was digested with EcoRII and EcoRI 
to provide a 700 bp fragment. Clone C8h was digested with EcoRI and DdeI to 
provide a 208 bp fragment. These three fragments were then ligated into the EcoRI 
site of pBR322, and transformed into E. coli HB101, to provide the vector C300. 
(D) Preparation of Full Length Clones: 

A 600 bp fragment was obtained from C7f+C20c by digestion with 
EcoRI and NaeI, and ligated to a 945 bp NaeI/EcoRI fragment from C300, and the 
construct inserted into the EcoRI site of pGEM4Z (commercially available from 
Promega) to form the vector C7fC20cC300. 
C7fC20cC300 was digested with NdeI and EcoRI to provide a 892 bp 
fragment, which was ligated with a 1160 bp fragment obtained by digesting C200 with 
NdeI and EcoRI. The resulting construct was inserted into the EcoRI site of pBR322 
to provide the vector C7fC20cC300C200. Construction of this vector is illustrated 
schematically in Figure 9. 
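
As a bookkeeping aid, the nominal insert lengths of the cassettes described in this 
Example may be tallied from the stated fragment sizes. The following Python sketch 
simply sums the fragment lengths given above; the dictionary layout and function name 
are assumptions made for the illustration, and the sums are nominal values only, since 
the stated sizes are approximate. 

```python
# Fragment sizes (bp) as stated in parts (A) through (D) of this Example.
CASSETTES = {
    "C200": [718, 179, 377],            # C33c + C31 + C35 fragments
    "C7f+C20c": [400, 260],             # 7f + C20c fragments
    "C300": [100, 700, 208],            # C26d + C33c + C8h fragments
    "C7fC20cC300": [600, 945],          # EcoRI/NaeI + NaeI/EcoRI fragments
    "C7fC20cC300C200": [892, 1160],     # NdeI/EcoRI fragments
}


def assembled_length(fragments):
    """Nominal insert length: the sum of the ligated fragment lengths."""
    return sum(fragments)


for name, fragments in CASSETTES.items():
    print(f"{name}: {assembled_length(fragments)} bp")
```

The nominal length obtained for C7fC20cC300C200 is close to the 2000 bp EcoRI fragment 
recited in Example 4(A), and that for C7fC20cC300 is close to the 1550 bp fragment 
recited in Example 4(D). 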

Example 4 

(Preparation of E. coli Expression Vectors) 

(A) cf1SODp600: 

This vector contains a full-length HCV protease coding sequence fused 
to a functional hSOD leader. The vector C7fC20cC300C200 was cleaved with EcoRI 
to provide a 2000 bp fragment, which was then ligated into the EcoRI site of plasmid 
cf1CD (Example 2A). The resulting vector encodes amino acids 1-151 of hSOD, and 
amino acids 946-1630 of HCV (numbered from the beginning of the polyprotein, 
corresponding to amino acids 1-686 in Figure 1). The vector was labeled cf1SODp600 
(sometimes referred to as P600), and was transformed into E. coli D1210 cells. These 
cells, ATCC accession no. 68275, were deposited as set forth below. 

(B) P190: 

A truncated SOD-protease fusion polynucleotide was prepared by 
excising a 600 bp EcoRI/NaeI fragment from C7f+C20c, blunting the fragment with 
Klenow fragment, ligating the blunted fragment into the Klenow-blunted EcoRI site of 
cf1EF (Example 2A). This polynucleotide encodes a fusion protein having amino 
acids 1-151 of hSOD, and amino acids 1-199 of HCV protease. 

(C) P300: 

A longer truncated SOD-protease fusion polynucleotide was prepared by 
excising an 892 bp EcoRI/NdeI fragment from C7fC20cC300, blunting the fragment 
with Klenow fragment, ligating the blunted fragment into the Klenow-blunted EcoRI 
site of cf1EF. This polynucleotide encodes a fusion protein having amino acids 1-151 
of hSOD, and amino acids 1-299 of HCV protease. 



WO 91/15596 



PCT/US91/02209 






(D) P500: 

A longer truncated SOD-protease fusion polynucleotide was prepared by 
excising a 1550 bp EcoRI/EcoRI fragment from C7fC20cC300, and ligating the 
fragment into the EcoRI site of cf1CD to form P500. This polynucleotide encodes a 
fusion protein having amino acids 1-151 of hSOD, and amino acids 946-1457 of HCV 
protease (amino acids 1-513 in Figure 1). 

(E) FLAG/Protease Fusion 

This vector contains a full-length HCV protease coding sequence fused 
to the FLAG sequence, Hopp et al. (1988) Biotechnology 6: 1204-1210. PCR was 
used to produce a HCV protease gene with special restriction ends for cloning ease. 
Plasmid p500 was digested with EcoRI and NdeI to yield a 900 bp fragment. This 
fragment and two primers were used in a polymerase chain reaction to introduce a 
unique BglII site at amino acid 1009 and a stop codon with a SalI site at amino acid 
1262 of the HCV-1, as shown in Figure 17 of WO 90/11089, published 4 October 
1990. The sequence of the primers is as follows: 

5' CCC GAG CAA GAT CTC CCG GCC C 3' 
and 

5' CCC GGC TGC ATA AGC AGT CGA CTT GGA 3' 
After 30 cycles of PCR, the reaction was digested with BglII and SalI, and the 710 bp 
fragment was isolated. This fragment was annealed and ligated to the following 
duplex: 

MetAspTyrLysAspAspAspAspLysGlyArgGlu 
CATGGACTACAAAGACGATGACGATAAAGGCCGGGA 
    CTGATGTTTCTGCTACTGCTATTTCCGGCCCTCTAG 

The duplex encodes the FLAG sequence, an initiator methionine, and a 5' NcoI 
restriction site. The resulting NcoI/SalI fragment was ligated into a derivative of 
pCF1, which lacks the SOD gene and contains an optimized ribosome binding site for 
enhanced translational efficiency. 
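
The coding content of the duplex may be confirmed by translating its upper strand. The 
following Python sketch is illustrative only. It assumes that the lower strand is 
written 3' to 5' beneath the upper strand and offset by the four-nucleotide NcoI 
overhang, and that the final Glu codon is completed by the first nucleotide supplied by 
the BglII end of the PCR fragment; the variable and function names are hypothetical, and 
the codon table lists only the codons occurring in the duplex. 

```python
TOP = "CATGGACTACAAAGACGATGACGATAAAGGCCGGGA"     # upper strand, 5'->3'
BOTTOM = "CTGATGTTTCTGCTACTGCTATTTCCGGCCCTCTAG"  # lower strand, assumed 3'->5'

COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Standard genetic code, restricted to the codons used in this duplex.
CODONS = {
    "ATG": "Met", "GAC": "Asp", "GAT": "Asp", "TAC": "Tyr",
    "AAA": "Lys", "GGC": "Gly", "CGG": "Arg", "GAG": "Glu",
}


def translate(dna, start):
    """Translate complete codons of dna beginning at 0-based index start."""
    return [CODONS[dna[i:i + 3]] for i in range(start, len(dna) - 2, 3)]


# The lower strand should complement the upper strand beyond its 5' CATG end.
strands_pair = TOP[4:].translate(COMPLEMENT) == BOTTOM[:-4]

# The 4-nt lower-strand overhang pairs with the BglII end (GATC...); its first
# nucleotide completes the last codon of the upper strand.
junction = TOP + BOTTOM[-4:].translate(COMPLEMENT)[:1]
peptide = translate(junction, start=1)

print(strands_pair)
print("".join(peptide))
```

Under these assumptions the translated residues agree with the amino acid sequence 
printed above the duplex. 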

This construct is then transformed into E. coli D1210 cells and expression of 
the protease is induced by the addition of IPTG. 

The FLAG sequence was fused to the HCV protease to facilitate purification. 
A calcium dependent monoclonal antibody, which binds to the FLAG encoded peptide, 
is used to purify the fusion protein without harsh eluting conditions. 

Example 5 

(E. coli Expression of SOD-Protease Fusion Proteins) 
(A) E. coli D1210 cells were transformed with cf1SODp600 and grown in 
Luria broth containing 100 µg/mL ampicillin to an OD of 0.3-0.5. IPTG was then 
added to a concentration of 2 mM, and the cells cultured to a final OD of 0.9 to 1.3. 
The cells were then lysed, and the lysate analyzed by Western blot using anti-HCV 
sera, as described in USSN 7/456,637. 

The results indicated the occurrence of cleavage, as no full length product 
(theoretical Mr 93 kDa) was evident on the gel. Bands corresponding to the hSOD 
fusion partner and the separate HCV protease appeared at relative molecular weights 
of about 34, 53, and 66 kDa. The 34 kDa band corresponds to the hSOD partner 
(about 20 kDa) with a portion of the NS3 domain, while the 53 and 66 kDa bands 
correspond to HCV protease with varying degrees of (possibly bacterial) processing. 

(B) E. coli D1210 cells were transformed with P500 and grown in Luria 
broth containing 100 µg/mL ampicillin to an OD of 0.3-0.5. IPTG was then added to 
a concentration of 2 mM, and the cells cultured to a final OD of 0.8 to 1.0. The cells 
were then lysed, and the lysate analyzed as described above. 

The results again indicated the occurrence of cleavage, as no full length 
product (theoretical Mr 73 kDa) was evident on the gel. Bands corresponding to the 
hSOD fusion partner and the truncated HCV protease appeared at molecular weights 
of about 34 and 45 kDa, respectively. 

(C) E. coli D1210 cells were transformed with vectors P300 and P190 and 
grown as described above. 

The results from P300 expression indicated the occurrence of cleavage, as no 
full length product (theoretical Mr 51 kDa) was evident on the gel. A band 
corresponding to the hSOD fusion partner appeared at a relative molecular weight of 
about 34 kDa. The corresponding HCV protease band was not visible, as this region of the 
NS3 domain is not recognized by the sera employed to detect the products. However, 
appearance of the hSOD band at 34 kDa rather than 51 kDa indicates that cleavage 
occurred. 

The P190 expression product appeared only as the full (encoded) length 
product without cleavage, forming a band at about 40 kDa, which corresponds to the 
theoretical molecular weight for the uncleaved product. This may indicate that the 
minimum essential sequence for HCV protease extends to the region between amino 
acids 199 and 299. 
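
The theoretical molecular weights recited in this Example may be approximated from the 
number of residues encoded by each construct. The following Python sketch is a rough 
estimate only. It assumes an average residue mass of about 110 Da, which is a common 
rule of thumb and not a value taken from this specification, and it counts residues from 
the amino acid ranges recited in Example 4; the names used are hypothetical. 

```python
AVG_RESIDUE_KDA = 0.110   # assumed average mass per residue (rule of thumb)
HSOD_RESIDUES = 151       # hSOD amino acids 1-151, per Example 4

# HCV-encoded residues per construct, from the ranges recited in Example 4.
HCV_RESIDUES = {
    "P600": 1630 - 946 + 1,   # polyprotein amino acids 946-1630
    "P500": 1457 - 946 + 1,   # polyprotein amino acids 946-1457
    "P300": 299,              # protease amino acids 1-299
    "P190": 199,              # protease amino acids 1-199
}


def estimated_mr_kda(hcv_residues):
    """Approximate Mr (kDa) of the uncleaved hSOD fusion protein."""
    return (HSOD_RESIDUES + hcv_residues) * AVG_RESIDUE_KDA


for name, n in HCV_RESIDUES.items():
    print(f"{name}: about {estimated_mr_kda(n):.0f} kDa uncleaved")
```

Estimates obtained in this way fall within about 2 kDa of the theoretical values of 93, 
73, 51 and 40 kDa recited above; a composition-based calculation would be needed for 
closer agreement. 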


Example 6 

(Purification of E. coli Expressed Protease) 
The HCV protease and fragments expressed in Example 5 may be purified as 
follows: 

The bacterial cells in which the polypeptide was expressed are subjected to 
osmotic shock and mechanical disruption, the insoluble fraction containing the protease 
is isolated and subjected to differential extraction with an alkaline-NaCl solution, and 
the polypeptide in the extract purified by chromatography on columns of 
S-Sepharose® and Q-Sepharose®. 

The crude extract resulting from osmotic shock and mechanical disruption is 
prepared by suspending 1 g of the packed cells in 10 mL of a solution containing 0.02 
M Tris HCl, pH 7.5, 10 mM EDTA, 20% sucrose, and incubating for 10 minutes on 
ice. The cells are then pelleted by centrifugation at 4,000 x g for 15 min at 4°C. 
After the supernatant is removed, the cell pellets are resuspended in 10 mL of Buffer 
A1 (0.01 M Tris HCl, pH 7.5, 1 mM EDTA, 14 mM β-mercaptoethanol, "βME"), 
and incubated on ice for 10 minutes. The cells are again pelleted at 4,000 x g for 15 
minutes at 4°C. After removal of the clear supernatant (periplasmic fraction I), the 
cell pellets are resuspended in Buffer A1, incubated on ice for 10 minutes, and again 
centrifuged at 4,000 x g for 15 minutes at 4°C. The clear supernatant (periplasmic 
fraction II) is removed, and the cell pellet resuspended in 5 mL of Buffer A2 (0.02 M 
Tris HCl, pH 7.5, 14 mM βME, 1 mM EDTA, 1 mM PMSF). In order to disrupt the 
cells, the suspension (5 mL) and 7.5 mL of Dyno-mill lead-free acid washed glass 
beads (0.10-0.15 mm diameter) (available from Glen-Mills, Inc.) are placed in a 
Falcon tube and vortexed at top speed for two minutes, followed by cooling for at 
least 2 min on ice. The vortexing-cooling procedure is repeated another four times. 
After vortexing, the slurry is filtered through a sintered glass funnel using low suction, 
the glass beads washed twice with Buffer A2, and the filtrate and washes combined. 

The insoluble fraction of the crude extract is collected by centrifugation at 
20,000 x g for 15 min at 4°C, washed twice with 10 mL Buffer A2, and resuspended 
in 5 mL of MILLI-Q water. 
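
The centrifugation steps above are specified as relative centrifugal force (x g), whereas many instruments are set in rpm. The following is a minimal sketch of the standard conversion, RCF = 1.118e-5 x r(cm) x rpm^2. The rotor radius used here is purely an assumption for illustration and is not given in this Example; the radius of the actual rotor must be substituted.

```python
import math

# Standard conversion constant for RCF = K * radius_cm * rpm**2
K = 1.118e-5


def rcf_for_rpm(rpm: float, radius_cm: float) -> float:
    """Relative centrifugal force (x g) at a given speed and rotor radius."""
    return K * radius_cm * rpm ** 2


def rpm_for_rcf(rcf_g: float, radius_cm: float) -> float:
    """Rotor speed (rpm) needed to reach a given RCF at a given radius."""
    return math.sqrt(rcf_g / (K * radius_cm))


# ASSUMPTION: a 10 cm effective rotor radius, chosen only as an example.
ASSUMED_RADIUS_CM = 10.0

if __name__ == "__main__":
    for rcf in (4000, 20000):  # the two forces used in this Example
        rpm = rpm_for_rcf(rcf, ASSUMED_RADIUS_CM)
        print(f"{rcf} x g at r = {ASSUMED_RADIUS_CM} cm is about {rpm:.0f} rpm")
```
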

A fraction containing the HCV protease is isolated from the insoluble material 
by adding to the suspension NaOH (2 M) and NaCl (2 M) to yield a final concentration 
of 20 mM each, vortexing the mixture for 1 minute, centrifuging it at 20,000 x g for 20 
min at 4°C, and retaining the supernatant. 
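
As an illustration of the alkaline-NaCl extraction step, the volume of each 2 M stock needed to bring the 5 mL suspension to 20 mM can be estimated as sketched below. The calculation assumes that equal volumes of the two stocks are added and that volumes are additive; neither assumption is stated in the Example.

```python
def equal_stock_volume(v0_ml: float, stock_molar: float,
                       final_molar: float, n_stocks: int = 2) -> float:
    """Volume (mL) of each stock to add so that every solute reaches final_molar.

    Solves stock * v / (v0 + n * v) = final for v, where n stocks of equal
    concentration are each added in the same volume v (assumed additive).
    """
    return final_molar * v0_ml / (stock_molar - n_stocks * final_molar)


if __name__ == "__main__":
    v = equal_stock_volume(v0_ml=5.0, stock_molar=2.0, final_molar=0.020)
    total = 5.0 + 2 * v
    print(f"add about {v * 1000:.0f} uL each of 2 M NaOH and 2 M NaCl")
    print(f"final volume {total:.3f} mL, final conc {2.0 * v / total * 1000:.1f} mM")
```

Under these assumptions the result is roughly 51 µL of each stock, so the simpler approximation of 50 µL per stock differs by only about 2%.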

The partially purified protease is then purified by SDS-PAGE. The protease 
may be identified by western blot, and the band excised from the gel. The protease is 
then eluted from the band, and analyzed to confirm its amino acid sequence. N-terminal 
sequences may be analyzed using an automated amino acid sequencer, while 
C-terminal sequences may be analyzed by automated amino acid sequencing of a 
series of tryptic fragments. 

Example 7 

(Preparation of Yeast Expression Vector) 
(A) P650 (SOD/Protease Fusion) 

This vector contains HCV sequence, which includes the wild-type full-length 
HCV protease coding sequence, fused at the 5' end to a SOD coding sequence. Two 
fragments, a 441 bp EcoRI/BglII fragment from clone 11b and a 1471 bp BglII/EcoRI 
fragment from expression vector P500, were used to reconstruct a wild-type, 
full-length HCV protease coding sequence. These two fragments were ligated together 
with an EcoRI digested pS356 vector to produce an expression cassette. The 
expression cassette encodes the ADH2/GAPDH hybrid yeast promoter, human SOD, 
the HCV protease, and a GAPDH transcription terminator. The resulting vector was 
digested with BamHI and a 4052 bp fragment was isolated. This fragment was ligated 
to the BamHI digested pAB24 vector to produce p650. p650 expresses a polyprotein 
containing, from its amino terminal end, amino acids 1-154 of hSOD, an 
oligopeptide -Asn-Leu-Gly-Ile-Arg-, and amino acids 819 to 1458 of HCV-1, as 
shown in Figure 17 of WO 90/11089, published 4 October 1990. 

Clone 11b was isolated from the genomic library of HCV cDNA, ATCC 
accession no. 40394, as described above in Example 3A, using a hybridization probe 
having the following sequence: 

5' CAC CTA TGT TTA TAA CCA TCT CAC TCC TCT 3'. 

This procedure is also described in EPO Pub. No. 318 216, Example IV.A17. 
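
As a quick consistency check, the 30-mer probe can be located within the coding strand shown in the sequence figures. The sketch below compares the probe with the first line of the figure whose upper strand begins ATT CGG GGC ACC (copied here from that figure); the helper name `revcomp` is introduced only for this illustration.

```python
# Probe as given in this Example (spaces are only a reading aid).
PROBE = "CAC CTA TGT TTA TAA CCA TCT CAC TCC TCT".replace(" ", "")

# First line of the upper (coding) strand from the sequence figure
# that begins "ATT CGG GGC ACC ..." (copied from the figure).
FIGURE_TOP = (
    "ATT CGG GGC ACC TAT GTT TAT AAC CAT CTC ACT CCT CTT CGG GAC TGG"
).replace(" ", "")


def revcomp(seq: str) -> str:
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


if __name__ == "__main__":
    pos = FIGURE_TOP.find(PROBE)
    print("probe length:", len(PROBE))
    print("0-based offset in figure strand:", pos)
    print("sense orientation:", pos >= 0)
    print("antisense orientation:", revcomp(PROBE) in FIGURE_TOP)
```

The probe matches the upper strand directly, which suggests it is written in the sense orientation and is offset by one nucleotide from the reading frame shown in the figure.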

The vector pS3EF, which is a pBR322 derivative, contains the ADH2/GAPDH 
hybrid yeast promoter upstream of the human superoxide dismutase gene, an adaptor, 
and a downstream yeast-effective transcription terminator. A similar expression vector 
containing these control elements and the superoxide dismutase gene is described in 
Cousens et al. (1987) Gene 61: 265, and in copending application EPO 196,056, 
published October 1, 1986. pS3EF, however, differs from that in Cousens et al. in 
that the heterologous proinsulin gene and the immunoglobulin hinge are deleted, and 
Gln154 of SOD is followed by an 
adaptor sequence which contains an EcoRI site. The sequence of the adaptor is: 

5' AAT TTG GGA ATT CCA TAA TTA ATT AAG 3' 
3'      AC CCT TAA GGT ATT AAT TAA TTC AGCT 5' 

The EcoRI site facilitates the insertion of heterologous sequences. Once inserted into 
pS3EF, a SOD fusion is expressed which contains an oligopeptide that links SOD to 
the heterologous sequences. pS3EF is exactly the same as pS356 except that pS356 
contains a different adaptor. The sequence of the adaptor is shown below: 

5' AAT TTG GGA ATT CCA TAA TGA G 3' 
3'      AC CCT TAA GGT ATT ACT CAG CT 5' 

pS356, ATCC accession no. 67683, is deposited as set forth below. 
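
The two adaptors can be checked for internal consistency with a short script. This is only a sketch: it assumes that each duplex has a 4-nucleotide single-stranded overhang at both ends, and it reads the final base of the pS3EF upper strand as G, inferred from the C opposite it on the lower strand.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# (upper strand 5'->3', lower strand written 3'->5'), as printed above.
ADAPTORS = {
    "pS3EF": ("AAT TTG GGA ATT CCA TAA TTA ATT AAG",
              "AC CCT TAA GGT ATT AAT TAA TTC AGCT"),
    "pS356": ("AAT TTG GGA ATT CCA TAA TGA G",
              "AC CCT TAA GGT ATT ACT CAG CT"),
}


def clean(seq: str) -> str:
    """Drop the spaces used to group bases in triplets."""
    return seq.replace(" ", "")


def strands_pair(top_5to3: str, bottom_3to5: str, overhang: int = 4) -> bool:
    """True if the duplex region is complementary base for base.

    ASSUMPTION: the first `overhang` bases of the upper strand and the last
    `overhang` bases of the lower strand (as written 3'->5') are unpaired.
    """
    top, bottom = clean(top_5to3), clean(bottom_3to5)
    return top[overhang:].translate(COMPLEMENT) == bottom[:-overhang]


if __name__ == "__main__":
    for name, (top, bottom) in ADAPTORS.items():
        print(name,
              "strands complementary:", strands_pair(top, bottom),
              "| EcoRI site (GAATTC) present:", "GAATTC" in clean(top))
```

Both adaptors pass the check, and the first four codons of each upper strand (AAT TTG GGA ATT) correspond to the Asn-Leu-Gly-Ile portion of the linker oligopeptide described for p650.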

Plasmid pAB24 is a yeast shuttle vector, which contains pBR322 
sequences, the complete 2µ sequence for DNA replication in yeast (Broach (1981) in: 
Molecular Biology of the Yeast Saccharomyces, Vol. 1, p. 445, Cold Spring Harbor 
Press.) and the yeast LEU2d gene derived from plasmid pC1/1, described in EPO Pub. 
No. 116 201. Plasmid pAB24 was constructed by digesting YEp24 with EcoRI and 
re-ligating the vector to remove the partial 2 micron sequences. The resulting plasmid, 
YEp24deltaRI, was linearized with ClaI and ligated with the complete 2 micron 
plasmid which had been linearized with ClaI. The resulting plasmid, pCBou, was then 
digested with XbaI, and the 8605 bp vector fragment was gel isolated. This isolated 
XbaI fragment was ligated with a 4460 bp XbaI fragment containing the LEU2d gene 
isolated from pC1/1; the orientation of the LEU2d gene is in the same direction as the 
URA3 gene. 

S. cerevisiae, 2150-2-3 (pAB24-GAP-env2), accession no. 20827, is 
deposited with the American Type Culture Collection as set forth below. The plasmid 
pAB24-GAP-env2 can be recovered from the yeast cells by known techniques. The 
GAP-env2 expression cassette can be removed by digesting pAB24-GAP-env2 with 
BamHI. pAB24 is recovered by religating the vector without the BamHI insert. 

Example 8 

(Yeast Expression of SOD-Protease Fusion Protein) 
p650 was transformed into S. cerevisiae strain JSC310, Mata, leu2, ura3-52, 
prb1-1122, pep4-3, prc1-407, cir°::DM15 (G418 resistance). The transformation is 
as described by Hinnen et al. (1978) Proc Natl Acad Sci USA 75: 1929. The 
transformed cells were selected on ura- plates with 8% glucose. The plates were 
incubated at 30°C for 4-5 days. The transformants were further selected on leu- plates 
with 8% glucose, putatively for high numbers of the p650 plasmid. Colonies from the 
leu- plates were inoculated into leu- medium with 3% glucose. These cultures were 
shaken at 30°C for 2 days and then diluted 1/20 into YEPD medium with 2% glucose 
and shaken for 2 more days at 30°C. 

S. cerevisiae JSC310 contains DM15 DNA, described in EPO Pub. No. 
340 986, published 8 November 1989. This DM15 DNA enhances ADH2 regulated 
expression of heterologous proteins. pDM15, accession no. 40453, is deposited with 
the American Type Culture Collection as set forth below. 

Example 9 

(Yeast Ubiquitin Expression of Mature HCV Protease) 
Mature HCV protease is prepared by cleaving vector 
C7fC20cC300C200 with EcoRI to obtain a 2 Kb coding sequence, and inserting the 
sequence with the appropriate linkers into a ubiquitin expression vector, such as that 
described in WO 88/02406, published 7 April 1988, or USSN 7/390,599 filed 7 
August 1989, incorporated herein by reference. Mature HCV protease is recovered 
upon expression of the vector in suitable hosts, particularly yeast. Specifically, the 
yeast expression protocol described in Example 8 is used to express a ubiquitin/HCV 
protease vector. 

Example 10 

(Preparation of an In-Vitro Expression Vector) 

(A) pGEM®-3Z/Yellow Fever Leader Vector 

Four synthetic DNA fragments were annealed and ligated together to 
create an EcoRI/SacI Yellow Fever leader, which was ligated to an EcoRI/SacI digested 
pGEM®-3Z vector from Promega®. The sequences of the four fragments are listed 
below: 
YFK-1: 

5' AAT TCG TAA ATC CTG TGT GCT AAT TGA GGT GCA TTG GTC TGC 
AAA TCG AGT TGC TAG GCA ATA AAC ACA TT 3' 

YFK-2: 

5' TAT TGC CTA GCA ACT CGA TTT GCA GAC CAA TGC ACC TCA ATT 
AGC ACA CAG GAT TTA CG 3' 

YFK-3: 

5' TGG ATT AAT TTT AAT CGT TCG TTG AGC GAT TAG CAG AGA ACT 
GAC CAG AAC ATG TCT GAG CT 3' 

YFK-4: 

5' CAG ACA TGT TCT GGT CAG TTC TCT GCT AAT CGC TCA ACG AAC 
GAT TAA AAT TAA TCC AAA TGT GTT 3'. 

For in-vitro translation of the HCV protease, the new pGEM®-3Z/Yellow 
Fever leader vector was digested with BamHI and blunted with Klenow. 
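
The way the four oligonucleotides fit together can be checked computationally. The sketch below assumes that YFK-1 and YFK-3 form the upper strand and that YFK-4 and YFK-2 form the lower strand, leaving a 5' AATT overhang (EcoRI-compatible) and a 3' AGCT overhang (SacI-compatible) on the upper strand. That arrangement is inferred from the sequences themselves and is not stated explicitly in this Example.

```python
def clean(seq: str) -> str:
    """Drop the spaces used to group bases in triplets."""
    return seq.replace(" ", "")


def revcomp(seq: str) -> str:
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


# Oligonucleotides as listed in this Example, all written 5'->3'.
YFK1 = clean("AAT TCG TAA ATC CTG TGT GCT AAT TGA GGT GCA TTG GTC TGC "
             "AAA TCG AGT TGC TAG GCA ATA AAC ACA TT")
YFK2 = clean("TAT TGC CTA GCA ACT CGA TTT GCA GAC CAA TGC ACC TCA ATT "
             "AGC ACA CAG GAT TTA CG")
YFK3 = clean("TGG ATT AAT TTT AAT CGT TCG TTG AGC GAT TAG CAG AGA ACT "
             "GAC CAG AAC ATG TCT GAG CT")
YFK4 = clean("CAG ACA TGT TCT GGT CAG TTC TCT GCT AAT CGC TCA ACG AAC "
             "GAT TAA AAT TAA TCC AAA TGT GTT")

# ASSUMED arrangement: upper strand = YFK-1 + YFK-3, lower = YFK-4 + YFK-2.
TOP = YFK1 + YFK3
BOTTOM = YFK4 + YFK2


def leader_is_consistent() -> bool:
    """True if the lower strand pairs with the upper strand minus overhangs."""
    return TOP == "AATT" + revcomp(BOTTOM) + "AGCT"


if __name__ == "__main__":
    print("upper strand length:", len(TOP))
    print("lower strand length:", len(BOTTOM))
    print("duplex consistent with EcoRI/SacI ends:", leader_is_consistent())
```

Under this arrangement the junction between YFK-1 and YFK-3 on the upper strand is bridged by YFK-4, and the junction between YFK-4 and YFK-2 on the lower strand is bridged by YFK-1, so the staggered nicks allow the four fragments to be ligated into a single duplex.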

(B) PvuII Construct from p6000 

A clone p6000 was constructed from sequences available from the 
genomic library of HCV cDNA, ATCC accession no. 40394. The HCV encoding 
DNA sequence of p6000 is identical to nucleotide -275 to nucleotide 6372 of Figure 
17 of WO 90/11089, published 4 October 1990. p6000 was digested with PvuII, and 
from the digest, a 2,864 bp fragment was isolated. This 2,864 bp fragment was 
ligated to the prepared pGEM®-3Z/Yellow Fever leader vector fragment, described 
above. 






Example 11 
(In-Vitro Expression of HCV Protease) 

(A) Transcription 

The pGEM®-3Z/Yellow Fever leader/PvuII vector was linearized with 
XbaI and transcribed using the materials and protocols from Promega's Riboprobe® 
Gemini II Core system. 

(B) Translation 

The RNA produced by the above protocol was translated using 
Promega's rabbit reticulocyte lysate, minus methionine, canine pancreatic microsomal 
membranes, as well as other necessary materials and instructions from Promega. 






Deposited Biological Materials: 

The following materials were deposited with the American Type Culture 
Collection (ATCC), 12301 Parklawn Dr., Rockville, Maryland: 






Name                                        Deposit Date    Accession No. 

E. coli D1210, cf1SODp600                   23 Mar 1990     68275 
Cf1/5-1-1 in E. coli D1210                  11 May 1989     67967 
Bacteriophage λ-gt11 cDNA library           01 Dec 1987     40394 
E. coli HB101, pS356                        29 Apr 1988     67683 
plasmid DNA, pDM15                          05 May 1988     40453 
S. cerevisiae, 2150-2-3 (pAB24-GAP-env2)    23 Dec 1986     20827 

The above materials have been deposited with the ATCC under the accession 
numbers indicated. These deposits will be maintained under the terms of the Budapest 
Treaty on the International Recognition of the Deposit of Microorganisms for purposes 
of Patent Procedure. These deposits are provided as a convenience to those of skill in 
the art, and are not an admission that a deposit is required under 35 U.S.C. §112. The 
polynucleotide sequences contained in the deposited materials, as well as the amino 
acid sequence of the polypeptides encoded thereby, are incorporated herein by 
reference and are controlling in the event of any conflict with the sequences described 
herein. A license may be required to make, use or sell the deposited materials, and no 
such license is granted hereby. 

WHAT IS CLAIMED: 



1. A method for assaying compounds for activity against Hepatitis C 
virus, which method comprises: 
providing a proteolytically inactive HCV protease analog; 

contacting said inactive HCV protease analog with a mixture of candidate 
anti-HCV compounds; and 

determining which candidate compounds bind to said HCV protease analog. 



2. The method of claim 1, wherein said inactive HCV protease analog 
has substantially the following sequence: 

Arg Arg Gly Arg Glu Ile Leu Leu Gly Pro Ala Asp Gly Met Val Ser 
Lys Gly Trp Arg Leu Leu Ala Pro Ile Thr Ala Tyr Ala Gln Gln Thr 
Arg Gly Leu Leu Gly Cys Ile Ile Thr Ser Leu Thr Gly Arg Asp Lys 
Asn Gln Val Glu Gly Glu Val Gln Ile Val Ser Thr Ala Ala Gln Thr Phe 
Leu Ala Thr Cys Ile Asn Gly Val Cys Trp Thr Val Tyr His Gly Ala 
Gly Thr Arg Thr Ile Ala Ser Pro Lys Gly Pro Val Ile Gln Met Tyr Thr 
Asn Val Asp Gln Asp Leu Val Gly Trp Pro Ala Pro Gln Gly Ser Arg 
Ser Leu Thr Pro Cys Thr Cys Gly Ser Ser Asp Leu Tyr Leu Val Thr 
Arg His Ala Asp Val Ile Pro Val Arg Arg Arg Gly Asp Ser Arg Gly 
Ser Leu Leu Ser Pro Arg Pro Ile Ser Tyr Leu Lys Gly Ser Ala Gly Gly 
Pro Leu Leu Cys Pro Ala Gly His Ala Val Gly Ile Phe Arg Ala Ala Val 
Cys Thr Arg Gly Val Ala Lys Ala Val Asp Phe Ile Pro Val Glu Asn 
Leu Glu Thr Thr Met Arg. 

3. The method of claim 1, wherein said inactive HCV protease analog 
has substantially the following sequence: 

Gly Thr Tyr Val Tyr Asn His Leu Thr Pro Leu Arg Asp Trp Ala His 
Asn Gly Leu Arg Asp Leu Ala Val Ala Val Glu Pro Val Val Phe Ser 
Gln Met Glu Thr Lys Leu Ile Thr Trp Gly Ala Asp Thr Ala Ala Cys 
Gly Asp Ile Ile Asn Gly Leu Pro Val Ser Ala Arg Arg Gly Arg Glu Ile 
Leu Leu Gly Pro Ala Asp Gly Met Val Ser Lys Gly Trp Arg Leu Leu 
Ala Pro Ile Thr Ala Tyr Ala Gln Gln Thr Arg Gly Leu Leu Gly Cys Ile 
Ile Thr Ser Leu Thr Gly Arg Asp Lys Asn Gln Val Glu Gly Glu Val 

1 

Gly Thr 

ATT CGG GGC ACC 
TAA GCC CCG TGG 



5 

Tyr Val Tyr Asn 

TAT GTT TAT AAC 
ATA CAA ATA TTG 



10 

His Leu Thr Pro 

CAT CTC ACT CCT 
GTA GAG TGA GGA 



Leu Arg Asp Trp 

CTT CGG GAC TGG 
GAA GCC CTG ACC 



15 

Ala His Asn Gly 

GCG CAC AAC GGC 
CGC GTG TTG CCG 



20 

Leu Arg Asp Leu 

TTG CGA GAT CTG 
AAC GCT CTA GAC 



25 

Ala Val Ala Val 

GCC GTG GCT GTA 
CGG CAC CGA CAT 



30 

Glu Pro Val Val 

GAG CCA GTC GTC 
CTC GGT CAG CAG 



Phe Ser Gin Met 

TTC TCC CAA ATG 
AAG AGG GTT TAC 



35 

Glu Thr Lys Leu 

GAG ACC AAG CTC 
CTC TGG TTC GAG 



40 

lie Thr Trp Gly 

ATC ACG TGG GGG 
TAG TGC ACC CCC 



45 

Ala Asp Thr Ala 

GCA GAT ACC GCC 
CGT CTA TGG CGG 



50 

Ala Cys Gly Asp lie lie Asn Gly 

GCG TGC GGT GAC ATC ATC AAC GGC 
CGC ACG CCA CTG TAG TAG TTG CCG 



55 60 

Leu Pro Val Ser Ala Arg Arg Gly 

TTG CCT GTT TCC GCC CGC AGG GGC 
AAC GGA CAA AGG CGG GCG TCC CCG 



65 70 
Arg Glu lie Leu Leu Gly Pro Ala 
CGG GAG ATA CTG CTC GGG CCA GCC 
GCC CTC TAT GAC GAG CCC GGT CGG 



75 

Asp Gly Met Val Ser Lys Gly Trp 

GAT GGA ATG GTC TCC AAG GGT TGG 
CTA CCT TAC CAG AGG TTC CCA ACC 



80 

Arg Leu Leu Ala 

AGG TTG CTG GCG 
TCC AAC GAC CGC 



85 

Pro lie Thr Ala 

CCC ATC ACG GCG 
GGG TAG TGC CGC 



90 

Tyr Ala Gin Gin 

TAC GCC CAG CAG 
ATG CGG GTC GTC 



Thr Arg Gly Leu 

ACA AGG GGC CTC 
TGT TCC CCG GAG 



95 100 

Leu Gly Cys lie lie Thr Ser Leu 

CTA GGG TGC ATA ATC ACC AGC CTA 

GAT CCC ACG TAT TAG TGG TCG GAT 



105 110 

Thr Gly Arg Asp Lys Asn Gin Val 

ACT GGC CGG GAC AAA AAC CAA GTG 
TGA CCG GCC CTG TTT TTG GTT CAC 



Glu Gly Glu Val 
GAG GGT GAG GTC 
CTC CCA CTC CAG 



115 

Gin lie Val Ser 
CAG ATT GTG TCA 
GTC TAA CAC AGT 



120 

Thr Ala Ala Gin 
ACT GCT GCC CAA 
TGA CGA CGG GTT 



125 

Thr Phe Leu Ala 

ACC TTC CTG GCA 
TGG AAG GAC CGT 



Figure 1 




WO 91/15596 2/23 PCT/US91/02209 

130 135 140 

Thr cys lie lie Asn Gly Val Cys Trp Thr Val Tyr His Gly Ala Gly 

ACG TGC ATC ATC AAT GGG GTG TGC TGG ACT GTC TAC CAC GGG GCC GGA 
TGC ACG TAG TAG TTA CCC CAC ACG ACC TGA CAG ATG GTG CCC CGG CCT 





145 
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Thr Arg Thr lie Ala 


Ser 


Pro Lys 


Gly 


Pro 


Val 


lie 


Gin 


Met 


Tyr Thr 


ACG 


AGG ACC ATC GCG 


TCA 


CCC 


AAG 


GGT 


CCT 


GTC 


ATC 


CAG 


ATG 


TAT 
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TGC 


TCC TGG TAG CGC 


AGT 


GGG 
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CCA 

• 
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CAG 
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Asn Val Asp Gin Asp 


Leu 


Val 


Gly Trp 


Pro 


Ala 


Ser 


Gin Gly 


Thr Arg 


AAT 


GTA GAC CAA GAC 


CTT 


GTG 


GGC 


TGG 


CCC 


GCT 


TCG 


CAA 


GGT 


ACC 
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TTA 


CAT CTG GTT CTG 

* 
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CAC 


CCG 
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GGG 


CGA 
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TGG 


GCG 
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180 
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190 


Ser 


Leu Thr Pro Cys Thr Cys Gly 


Ser 


Ser 


Asp 


Leu 


Tyr 


Leu 


Val 


Thr 


TCA 


TTG ACA CCC TGC 


ACT 


TGC 


GGC 


TCC 


TCG 


GAC 


CTT 


TAC 


CTG 
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ACG 
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TGA 


ACG 
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AGG 
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195 










200 










205 


Arg His Ala Asp Val 


He 


Pro 


val 


Arg 


Arg 


Arg 


Gly 


Asp 


Ser 


Arg Gly 
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ATT 
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GTG 


CGC 


CGG 


CGG 
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GCG 
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t 
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He 
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CTG TTG TGC CCC GCG GGG CAC 
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GGC 


ATA 
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CTG 
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TAG 


GGA 


CAC 



Figure 1 ( continued) 
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255 

Glu Asn Leu Glu 
GAG AAC CTA GAG 
CTC TTG GAT CTC 



260 

Thr Thr Met Arg 

ACA ACC ATG AGG 
TGT TGG TAC TCC 



265 

Ser Pro Val Phe 
TCC CCG GTG TTC 
AGG GGC CAC AAG 



Thr Asp Asn Ser 
ACG GAT AAC TCC 
TGC CTA TTG AGG 



275 280 285 

Ser Pro Pro Val Val Pro Gin Ser Phe Gin Val Ala His Leu His Ala 

TCT CCA CCA GTA GTG CCC CAG AGC TTC CAG GTG GCT CAC CTC CAT GCT 

AGA GGT GGT CAT CAC GGG GTC TCG AAG GTC CAC CGA GTG GAG GTA CGA 



290 

Pro Thr Gly Ser 

CCC ACA GGC AGC 
GGG TGT CCG TCG 



Gly Lys Ser Thr 

GGC AAA AGC ACC 
CCG TTT TCG TGG 



295 

Lys Val Pro Ala 

AAG GTC CCG GCT 
TTC CAG GGC CGA 



300 . 

Ala Tyr Ala Ala 

GCA TAT GCA GCT 
CGT ATA CGT CGA 
t 

NdeI 



305 310 
Gin Gly Tyr Lys Val Leu Val Leu 
CAG GGC TAT AAG GTG CTA GTA CTC 
GTC CCG ATA TTC CAC GAT CAT GAG 



315 

Asn Pro Ser Val Ala Ala Thr Leu 
AAC CCC TCT GTT GCT GCA ACA CTG 
TTG GGG AGA CAA CGA CGT TGT GAC 



320 325 330 

Gly Phe Gly Ala Tyr Met Ser Lys Ala His Gly lie Asp Pro Asn lie 

GGC TTT GGT GCT TAC ATG TCC AAG GCT CAT GGG ATC GAT CCT AAC ATC 
CCG AAA CCA CGA ATG TAC AGG TTC CGA GTA CCC TAG CTA GGA TTG TAG 



335 340 

Arg Thr Gly Val Arg Thr He Thr 

AGG ACC GGG GTG AGA ACA ATT ACC 
TCC TGG CCC CAC TCT TGT TAA TGG 



345 350 

Thr Gly Ser Pro He Thr Tyr Ser 

ACT GGC AGC CCC ATC ACG TAC TCC 
TGA CCG TCG GGG TAG TGC ATG AGG 



Thr Tyr Gly Lys 
ACC TAC GGC AAG 
TGG ATG CCG TTC 



355 

Phe Leu Ala Asp 
TTC CTT GCC GAC 
AAG GAA CGG CTG 



360 

Gly Gly Cys Ser 

GGC GGG TGC TCG 
CCG CCC ACG AGC 



365 

Gly Gly Ala Tyr 

GGG GGC GCT TAT 
CCC CCG CGA ATA 



370 375 380 

Asp He He He Cys Asp Glu Cys His Ser Thr Asp Ala Thr Ser He 

GAC ATA ATA ATT TGT GAC GAG TGC CAC TCC ACG GAT GCC ACA TCC ATC 
CTG TAT TAT TAA ACA CTG CTC ACG GTG AGG TGC CTA CGG TGT AGG TAG 



Figure 1 (continued) 






385 390 

Leu Gly lie Gly Thr Val Leu Asp 

TTG GGC ATT GGC ACT GTC CTT GAC 
AAC CCG TAA CCG TGA CAG GAA CTG 



395 

Gin Ala Glu Thr Ala Gly Ala Arg 

CAA GCA GAG ACT GCG GGG GCG AGA 
GTT CGT CTC TGA CGC CCC CGC TCT 



400 

Leu Val Val Leu 

CTG GTT GTG CTC 
GAC CAA CAC GAG 



405 

Ala Thr Ala Thr 

GCC ACC GCC ACC 
CGG TGG CGG TGG 



410 

Pro Pro Gly Ser 

CCT CCG GGC TCC 
GGA GGC CCG AGG 



Val Thr Val Pro 

GTC ACT GTG CCC 
CAG TGA CAC GGG 



415 

His Pro Asn He 
CAT CCC AAC ATC 
GTA GGG TTG TAG 



420 

Glu Glu Val Ala 
GAG GAG GTT GCT 
CTC CTC CAA CGA 



425 

Leu Ser Thr Thr 
CTG TCC ACC ACC 
GAC AGG TGG TGG 



430 

Gly Glu lie Pro 
GGA GAG ATC CCT 
CCT CTC TAG GGA 



Phe Tyr Gly Lys 

TTT TAC GGC AAG 
AAA ATG CCG TTC 



435 

Ala He Pro Leu 
GCT ATC CCC CTC 
CGA TAG GGG GAG 



440 

Glu Val He Lys 

GAA GTA ATC AAG 
CTT CAT TAG TTC 



445 

Gly Gly Arg His 

GGG GGG AGA CAT 
CCC CCC TCT GTA 



450 

Leu He Phe Cys 
CTC ATC TTC TGT 
GAG TAG AAG ACA 



His Ser Lys Lys 

CAT TCA AAG AAG 
GTA AGT TTC TTC 



455 

Lys Cys Asp Glu 

AAG TGC GAC GAA 
TTC ACG CTG CTT 



460 

Leu Ala Ala Lys 

CTC GCC GCA AAG 
GAG CGG CGT TTC 



465 

Leu Val Ala Leu 

CTG GTC GCA TTG 
GAC CAG CGT AAC 



470 

Gly He Asn Ala 

GGC ATC AAT GCC 
CCG TAG TTA CGG 



Val Ala Tyr Tyr 

GTG GCC TAC TAC 
CAC CGG ATG ATG 



475 

Arg Gly Leu Asp 

CGC GGT CTT GAC 
GCG CCA GAA CTG 



480 485 490 

Val Ser Val He Pro Thr Ser Gly Asp Val Val Val Val Ala Thr Asp 
GTG TCC GTC ATC CCG ACC AGC GGC GAT GTT GTC GTC GTG GCA ACC GAT 
CAC AGG CAG TAG GGC TGG TCG CCG CTA CAA CAG CAG CAC CGT TGG CTA 
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495 

Ala Leu Met Thr 

GCC CTC ATG ACC 
CGG GAG TAC TGG 



500 

Gly Tyr Thr Gly 

GGC TAT ACC GGC 
CCG ATA TGG CCG 



505 

Asp Phe Asp Ser 

GAC TTC GAC TCG 
CTG AAG CTG AGC 



510 

Val Ile Asp Cys 

GTG ATA GAC TGC 
CAC TAT CTG ACG 



515 520 525 

Asn Thr Cys Val Thr Gin Thr Val Asp Phe Ser Leu Asp Pro Thr Phe 

AAT ACG TGT GTC ACC CAG ACA GTC GAT TTC AGC CTT GAC CCT ACC TTC 

TTA TGC ACA CAG TGG GTC TGT CAG CTA AAG TCG GAA CTG GGA TGG AAG 



530 

Thr lie Glu Thr 

ACC ATT GAG ACA 
TGG TAA CTC TGT 



He Thr Leu Pro 

ATC ACG CTC CCC 
TAG TGC GAG GGG 



535 

Gin Asp Ala Val 
CAA GAT GCT GTC 
GTT CTA CGA CAG 



540 

Ser Arg Thr Gin 

TCC CGC ACT CAA 
AGG GCG TGA GTT 



545 

Arg Arg Gly Arg 
CGT CGG GGC AGG 
GGA GCC CCG TCC 



550 

Thr Gly Arg Gly 

ACT GGC AGG GGG 
TGA CCG TCC CCC 



Lys Pro Gly He 
AAG CCA GGC ATC 
TTC GGT CCG TAG 



555 

Tyr Arg Phe Val 

TAC AGA TTT GTG 
ATG TCT AAA CAC 



560 

Ala Pro Gly Glu 

GCA CCG GGG GAG 
CGT GGC CCC CTC 



565 

Arg Pro Pro Gly 

CGC CCT CCC GGC 
GCG GGA GGG CCG 



570 

Met Phe Asp Ser 

ATG TTC GAC TCG 
TAC AAG CTG AGC 



Ser Val Leu Cys 

TCC GTC CTC TGT 
AGG CAG GAG ACA 



575 

Glu Cys Tyr Asp 
GAG TGC TAT GAC 
CTC ACG ATA CTG 



580 

Ala Gly Cys Ala 

GCA GGC TGT GCT 
CGT CCG ACA CGA 



585 

Trp Tyr Glu Leu 
TGG TAT GAG CTC 
ACC ATA CTC GAG 



590 

Thr Pro Ala Glu 

ACG CCC GCC GAG 
TGC GGG CGG CTC 



Thr Thr Val Arg 

ACT ACA GTT AGG 
TGA TGT CAA TCC 



595 

Leu Arg Ala Tyr 

CTA CGA GCG TAC 
GAT GCT CGC ATG 



600 

Met Asn Thr Pro 

ATG AAC ACC CCG 
TAC TTG TGG GGC 



605 

Gly Leu Pro Val 

GGG CTT CCC GTG 
CCC GAA GGG CAC 
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610 

cys Gin Asp His 

TGC CAG GAC CAT 
ACG GTC CTG GTA 



Leu Glu Phe Trp 
CTT GAA TTT TGG 
GAA CTT AAA ACC 



615 








GlU 


Gly 


val 


Phe 


GAG 


GGC 


GTC 


TTT 


CTC 


CCG 


CAG 


AAA 



620 

Thr Gly Leu Thr 
ACA GGC CTC ACT 
TGT CCG GAG TGA 



625 

His He Asp Ala 
CAT ATA GAT GCC 
GTA TAT CTA CGG 









630 


His 


Phe 


Leu 


Ser 


CAC 


TTT 


CTA 


TCC 


GTG 


AAA 


GAT 


AGG 



Gin Thr Lys Gin 

CAG ACA AAG CAG 
GTC TGT TTC GTC 



635 

Ser Gly Glu Asn 

AGT GGG GAG AAC 
TCA CCC CTC TTG 



640 

Leu Pro Tyr Leu 

CTT CCT TAC CTG 
GAA GGA ATG GAC 







645 




val 


Ala 


Tyr 


Gin 


GTA 


GCG 


TAC 


CAA 


CAT 


CGC 


ATG 


GTT 



650 

Ala Thr Val Cys 
GCC ACC GTG TGC 
CGG TGG CAC ACG 



Ala Arg Ala Gin 

GCT AGG GCT CAA 
CGA TCC CGA GTT 



655 

Ala Pro Pro Pro 
GCC CCT CCC CCA 
CGG GGA GGG GGT 



660 

Ser Trp Asp Gin 
TCG TGG GAC CAG 
AGC ACC CTG GTC 







665 




Met 


Trp 


Lys 


Cys 


ATG 


TGG 


AAG 


TGT 


TAC 


ACC 


TTC 


ACA 



670 

Leu lie Arg Leu 
TTG ATT CGC CTC 
AAC TAA GCG GAG 



Lys Pro Thr Leu 

AAG CCC ACC CTC 
TTC GGG TGG GAG 



675 








His 


Gly 


Pro 


Thr 


CAT 


GGG 


CCA 


ACA 


GTA 


CCC 


GGT 


TGT 





680 






Pro 


Leu 


Leu 


Tyr 


CCC 


CTG 


CTA 


TAC 


GGG 


GAC 


GAT 


ATG 



685 

Arg Leu Gly Ala 

AGA CTG GGC GCT 
TCT GAC CCG CGA 

C20c: 

Asn Ser Glu Asn Gln Val Glu Gly Glu Val Gln Ile Val Ser Thr Ala 

AAT TCG GAA AAC CAA GTG GAG GGT GAG GTC CAG ATT GTG TCA ACT GCT 
TTA AGC CTT TTG GTT CAC CTC CCA CTC CAG GTC TAA CAC AGT TGA CGA 
t 

EcoRI 



Ala Gln Thr Phe Leu Ala Thr Cys Ile Asn Gly Val Cys Trp Thr Val 

GCC CAA ACC TTC CTG GCA ACG TGC ATC AAT GGG GTG TGC TGG ACT GTC 
CGG GTT TGG AAG GAC CGT TGC ACG TAG TTA CCC CAC ACG ACC TGA CAG 

t 

SfaNI 



Tyr His Gly Ala Gly Thr Arg Thr Ile Ala Ser Pro Lys Gly Pro Val 
TAC CAC GGG GCC GGA ACG AGG ACC ATC GCG TCA CCC AAG GGT CCT GTC 
ATG GTG CCC CGG CCT TGC TCC TGG TAG CGC AGT GGG TTC CCA GGA CAG 



Ile Gln Met Tyr Thr Asn Val Asp 

ATC CAG ATG TAT ACC AAT GTA GAC 
TAG GTC TAC ATA TGG TTA CAT CTG 



Gln Asp Leu Val Gly Trp Pro Ala 

CAA GAC CTT GTG GGC TGG CCC GCT 
GTT CTG GAA CAC CCG ACC GGG CGA 



Ser Gln Gly Thr Arg Ser Leu Thr 

TCG CAA GGT ACC CGC TCA TTG ACA 
AGC GTT CCA TGG GCG AGT AAC TGT 



Pro Cys Thr Cys Gly Ser Ser Asp 

CCC TGC ACT TGC GGC TCC TCG GAC 
GGG ACG TGA ACG CCG AGG AGC CTG 



Leu Tyr Leu Val Thr Arg His Ala 

CTT TAC CTG GTC ACG AGG CAC GCC 
GAA ATG GAC CAG TGC TCC GTG CGG 



Asp Val Ile Pro Val Arg Arg Arg 

GAT GTC ATT CCC GTG CGC CGG CGG 
CTA CAG TAA GGG CAC GCG GCC GCC 

t 

NaeI 



Gly Asp Ser Arg Gly Ser Leu Val Ser Pro Arg Pro Ile Ser Tyr Leu 

GGT GAT AGC AGG GGC AGC CTC GTG TCG CCC CGG CCC ATT TCC TAC TTG 
CCA CTA TCG TCC CCG TCG GAG CAC AGC GGG GCC GGG TAA AGG ATG AAC 



Lys Gly Ser Ser Gly Gly Pro Leu Pro Asn 

AAA GGC TCC TCG GGG GGT CCG CTG CCG AAT TC 
TTT CCG AGG AGC CCC CCA GGC GAC GGC TTA AG 

t 

EcoRI 

C26d: 

Glu Phe Gly Gly Leu Leu Leu Cys Pro Ala Ala Ala Val Gly Ile Phe 

GAA TTC GGG GGC CTG CTG TTG TGC CCC GCG GCA GCC GTG GGC ATA TTT 
CTT AAG CCC CCG GAC GAC AAC ACG GGG CGC CGT CGG CAC CCG TAT AAA 
t 

EcoRI 

Arg Ala Ala Val Cys Thr Arg Gly Val Ala Lys Ala Val Asp Phe Ile 

AGG GCC GCG GTG TGC ACC CGT GGA GTG GCT AAG GCG GTG GAC TTT ATC 
TCC CGG CGC CAC ACG TGG GCA CCT CAC CGA TTC CGC CAC CTG AAA TAG 

t 

DdeI 



Pro Val Glu Asn Leu Glu Thr Thr Met Arg Ser Pro Val Phe Thr Asp 

CCT GTG GAG AAC CTA GAG ACA ACC ATG AGG TCC CCG GTG TTC ACG GAT 
GGA CAC CTC TTG GAT CTC TGT TGG TAC TCC AGG GGC CAC AAG TGC CTA 



Asn Ser Ser Pro Pro Val Val Pro Gln Ser Phe Gln Val Ala His Leu 

AAC TCC TCT CCA CCA GTA GTG CCC CAG AGC TTC CAG GTG GCT CAC CTC 
TTG AGG AGA GGT GGT CAT CAC GGG GTC TCG AAG GTC CAC CGA GTG GAG 

T 

EcoRII 



His Ala Pro Arg Ile 
CAT GCT CCC CGA ATT C 
GTA CGA GGG GCT TAA G 

t 

EcoRI 



Figure 3 

Pro Cys Thr Cys Gly Ser Ser Asp Leu Tyr Leu Val Thr Arg His Ala 

CCC TGC ACT TGC GGC TCC TCG GAC CTT TAC CTG GTC ACG AGG CAC GCC 
GGG ACG TGA ACG CCG AGG AGC CTG GAA ATG GAC CAG TGC TCC GTG CGG 



Asp Val Ile Pro Val Arg Arg Arg Gly Asp Ser Arg Gly Ser Leu Leu 

GAT GTC ATT CCC GTG CGC CGG CGG GGT GAT AGC AGG GGC AGC CTG CTG 
CTA CAG TAA GGG CAC GCG GCC GCC CCA CTA TCG TCC CCG TCG GAC GAC 



Ser Pro Arg Pro Ile Ser Tyr Leu Lys Gly Ser Ser Gly Gly Pro Leu 

TCG CCC CGG CCC ATT TCC TAC TTG AAA GGC TCC TCG GGG GGT CCG CTG 
AGC GGG GCC GGG TAA AGG ATG AAC TTT CCG AGG AGC CCC CCA GGC GAC 



Leu Cys Pro Ala Gly His Ala Val Gly Ile Phe Arg Ala Ala Val Cys 
TTG TGC CCC GCG GGG CAC GCC GTG GGC ATA TTT AGG GCC GCG GTG TGC 
AAC ACG GGG CGC CCC GTG CGG CAC CCG TAT AAA TCC CGG CGC CAC ACG 



Thr Arg Gly Val Ala Lys Ala Val Asp Phe Ile Pro Val Glu Asn Leu 

ACC CGT GGA GTG GCT AAG GCG GTG GAC TTT ATC CCT GTG GAG AAC CTA 
TGG GCA CCT CAC CGA TTC CGC CAC CTG AAA TAG GGA CAC CTC TTG GAT 



t 

DdeI 



Glu Thr 
GAG ACA 
CTC TGT 



Thr Met Arg Ser 
ACC ATG AGG TCC 
TGG TAC TCC AGG 



Pro 
CCG 
GGC 



Val Phe Thr Asp Asn 
GTG TTC ACG GAT AAC 
CAC AAG TGC CTA TTG 



Ser 
TCC TC 
AGG AG 



Figure 4 

Ile Arg Gly Thr Tyr Val Tyr Asn His Leu Thr Pro Leu Arg Asp Trp 
ATT CGG GGC ACC TAT GTT TAT AAC CAT CTC ACT CCT CTT CGG GAC TGG 
TAA GCC CCG TGG ATA CAA ATA TTG GTA GAG TGA GGA GAA GCC CTG ACC 
t 



Ala His Asn Gly Leu Arg Asp Leu Ala Val Ala Val Glu Pro Val Val 

GCG CAC AAC GGC TTG CGA GAT CTG GCC GTG GCT GTA GAG CCA GTC GTC 
CGC GTG TTG CCG AAC GCT CTA GAC CGG CAC CGA CAT CTC GGT CAG CAG 



Phe Ser Gln Met Glu Thr Lys Leu Ile Thr Trp Gly Ala Asp Thr Ala 
TTC TCC CAA ATG GAG ACC AAG CTC ATC ACG TGG GGG GCA GAT ACC GCC 
AAG AGG GTT TAC CTC TGG TTC GAG TAG TGC ACC CCC CGT CTA TGG CGG 



Ala Cys Gly Asp Ile Ile Asn Gly Leu Pro Val Ser Ala Arg Arg Gly 
GCG TGC GGT GAC ATC ATC AAC GGC TTG CCT GTT TCC GCC CGC AGG GGC 
CGC ACG CCA CTG TAG TAG TTG CCG AAC GGA CAA AGG CGG GCG TCC CCG 



Arg Glu Ile Leu Leu Gly Pro Ala Asp Gly Met Val Ser Lys Gly Trp 

CGG GAG ATA CTG CTC GGG CCA GCC GAT GGA ATG GTC TCC AAG GGT TGG 
GCC CTC TAT GAC GAG CCC GGT CGG CTA CCT TAC CAG AGG TTC CCA ACC 



Arg Leu Leu Ala Pro Ile Thr Ala Tyr Ala Gln Gln Thr Arg Gly Leu 

AGG TTG CTG GCG CCC ATC ACG GCG TAC GCC CAG CAG ACA AGG GGC CTC 
TCC AAC GAC CGC GGG TAG TGC CGC ATG CGG GTC GTC TGT TCC CCG GAG 



Leu Gly Cys Ile Ile Thr Ser Leu Thr Gly Arg Asp Lys Asn Gln Val 

CTA GGG TGC ATA ATC ACC AGC CTA ACT GGC CGG GAC AAA AAC CAA GTG 
GAT CCC ACG TAT TAG TGG TCG GAT TGA CCG GCC CTG TTT TTG GTT CAC 



Glu Gly Glu Val Gln Ile Val Ser Thr Ala Ala Gln Thr Phe Leu Ala 

GAG GGT GAG GTC CAG ATT GTG TCA ACT GCT GCC CAA ACC TTC CTG GCA 
CTC CCA CTC CAG GTC TAA CAC AGT TGA CGA CGG GTT TGG AAG GAC CGT 



Thr Cys Ile Asn Gly Val Cys Trp Pro Asn 
ACG TGC ATC AAT GGG GTG TGC TGG CCG AAT TC 
TGC ACG TAG TTA CCC CAC ACG ACC GGC TTA AG 



EcoRI 



t 

SfaNI 



t 

EcoRI 
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£21: 






























Glu 


Phe 


Gly 


Ser 


Val 


lie 


Pro 


Thr Ser Gly Asp 


Val 


Val 


Val 


Val 


Ala 


GAA 


TTC 


GGG 


TCC 


GTC 


ATC 


CCG 


ACC 


AGC 


GGC 


GAT 


GTT 


GTC 


GTC 


GTC 


GCA 


CTT 


AAG 


CCC 


AGG 


CAG 


TAG 


GGC 


TGG 


TCG 


CCG 


CTA 


CAA 


CAG 


CAG 


CAG 


CGT 


T 

EcoRI 






























Thr Asp 


Ala 




Met: 


Thr 


Glv 


Tyr Thr Gly Asp 


Phe 


Asp 


Ser 


Val 


He 


ACC 


GAT 


GCC 


CTC 


ATG 


ACC 


GGC 


TAT 


ACC 


GGC 


GAC 


TTC 


GAC 


TCG 


GTG 


ATA 




CTA 


CGG 


GAG 


TAC 


TGG 


CCG 


ATA 


TGG 


CCG 


CTG 


AAG 


CTG 


AGC 


CAC 


TAT 
























HinfI 






Asp 


Cys 


Asn 


Thr 


cys 


val 


Thr 


Gin 


Thr 


Val 


Asp 


Phe 


Ser 


Leu Asp 


Pro 


GAC 


TGC 


AAT 


ACG 


TGT 


GTC 


ACC 


CAG 


ACA 


GTC 


GAT 


TTC 


AGC CTT GAC 


CCT 


CTG 


ACG 


TTA 


TGC 


ACA 


CAG 


TGG 


GTC 


TGT 


CAG 


CTA 


AAG 


TCG 


GAA 


CTG 


GGA 


Thr 


Phe 


Thr 


lie 


Glu 


Thr 


He 


Thr 


Leu 


Pro 


Gin 


Asp 


Ala 


val 


Ser 


Arg 


ACC 


TTC 


ACC 


ATT 


GAG 


ACA 


ATC 


ACG 


CTC 


CCC 


CAA 


GAT 


GCT 


GTC 


TCC 


CGC 


TGG 


AAG 


TGG 


TAA 


CTC 


TGT 


TAG 


TGC 


GAG 


GGG 


GTT 


CTA 


CGA 


CAG 


AGG 


GCG 


Thr 


Gin 


Kim 




GlV 


A. rex 


Thr 


Gly Arg Gly Lys 


Pro 


Gly 


He Tyr Arg 


ACT 


CAA 


CGT 


CGG 


GGC 


AGG 


ACT 


GGC 


AGG 


GGG 


AAG 


CCA 


GGC 


ATC 


TAC 


AGA 


TGA 


GTT 


GCA 


GCC 


CCG 


TCC 


TGA 


CCG 


TCC 


CCC 


TTC 


GGT 


CCG 


TAG 


ATG 


TCT 


Phe 


val 


Ala 


Pro 


Gly 


Glu 


Arg 


Pro Ser Gly Met 


Phe 


Asp 


Ser 


Ser 


Val 


TTT 


GTG 


GCA 


CCG 


GGG 


GAG 


CGC 


CCC 


TCC 


GGC 


ATG 


TTC 


GAC 


TCG 


TCC 


GTC 


AAA 


CAC 


CGT 


GGC 


CCC 


CTC 


GCG 


GGG 


AGG 


CCG 


TAC 


AAG 


CTG 


AGC 


AGG 


CAG 



t t 

Bgll Hinfl 



Leu Cys Glu Cys Pro Asn 

CTC TGT GAG TGC CCG AAT TC 
GAG ACA CTC ACG GGC TTA AG 

t 

£coRI 



Figure & 
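
The restriction sites marked under the rows (for example EcoRI, BglI and HinfI) can be located directly from the coding strand. The sketch below is a hedged illustration, not a method taken from the specification: it scans a row for a small subset of recognition sequences, with `N` standing for any base. The `SITES` table and the function name `find_sites` are assumptions introduced for this example.

```python
import re

# Recognition sequences for a subset of the enzymes labeled in the figures
# (N = any base). This table is an illustrative assumption, not exhaustive.
SITES = {
    "EcoRI": "GAATTC",
    "HinfI": "GANTC",
    "NdeI": "CATATG",
    "NaeI": "GCCGGC",
    "BglI": "GCCNNNNNGGC",
}


def find_sites(row):
    """Return (enzyme, 0-based start) pairs found on the coding strand."""
    seq = row.replace(" ", "").upper()
    hits = []
    for name, site in SITES.items():
        # A lookahead lets overlapping occurrences be reported as well.
        pattern = re.compile("(?=" + site.replace("N", "[ACGT]") + ")")
        hits.extend((name, match.start()) for match in pattern.finditer(seq))
    return sorted(hits, key=lambda hit: hit[1])


# Hypothetical usage with two coding-strand rows copied from the figure.
first_row = "GAA TTC GGG TCC GTC ATC CCG ACC AGC GGC GAT GTT GTC GTC GTC GCA"
later_row = "TTT GTG GCA CCG GGG GAG CGC CCC TCC GGC ATG TTC GAC TCG TCC GTC"
print(find_sites(first_row))
print(find_sites(later_row))
```

Positions count bases from the start of the row, so dividing by three gives the codon in which a site begins.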
Ile Arg Ser Ile Glu Thr Ile Thr Leu Pro Gln Asp Ala Val Ser Arg

ATT CGG TCC ATT GAG ACA ATC ACG CTC CCC CAG GAT GCT GTC TCC CGC 
TAA GCC AGG TAA CTC TGT TAG TGC GAG GGG GTC CTA CGA CAG AGG GCG 
T 

EcoRI

Thr Gln Arg Arg Gly Arg Thr Gly Arg Gly Lys Pro Gly Ile Tyr Arg

ACT CAA CGT CGG GGC AGG ACT GGC AGG GGG AAG CCA GGC ATC TAC AGA
TGA GTT GCA GCC CCG TCC TGA CCG TCC CCC TTC GGT CCG TAG ATG TCT 

Phe Val Ala Pro Gly Glu Arg Pro Ser Gly Met Phe Asp Ser Ser Val 

TTT GTG GCA CCG GGG GAG CGC CCC TCC GGC ATG TTC GAC TCG TCC GTC 
AAA CAC CGT GGC CCC CTC GCG GGG AGG CCG TAC AAG CTG AGC AGG CAG 

t 

BglI

Leu Cys Glu Cys Tyr Asp Ala Gly Cys Ala Trp Tyr Glu Leu Thr Pro

CTC TGT GAG TGC TAT GAC GCA GGC TGT GCT TGG TAT GAG CTC ACG CCC 
GAG ACA CTC ACG ATA CTG CGT CCG ACA CGA ACC ATA CTC GAG TGC GGG 



Ala Glu Thr Thr Val Arg Leu Arg Ala Tyr Met Asn Thr Pro Gly Leu
GCC GAG ACT ACA GTT AGG CTA CGA GCG TAC ATG AAC ACC CCG GGG CTT
CGG CTC TGA TGT CAA TCC GAT GCT CGC ATG TAC TTG TGG GGC CCC GAA

Pro Val Cys Gln Asp His Leu Glu Phe Trp Glu Gly Val Phe Thr Gly
CCC GTG TGC CAG GAC CAT CTT GAA TTT TGG GAG GGC GTC TTT ACA GGC
GGG CAC ACG GTC CTG GTA GAA CTT AAA ACC CTC CCG CAG AAA TGT CCG

Leu Thr His Ile Asp Ala His Phe Leu Ser Gln Thr Lys Gln Ser Gly
CTC ACT CAT ATA GAT GCC CAC TTT CTA TCC CAG ACA AAG CAG AGT GGG
GAG TGA GTA TAT CTA CGG GTG AAA GAT AGG GTC TGT TTC GTC TCA CCC

Glu Asn Leu Pro Tyr Leu Val Ala Tyr Gln Ala Thr Val Cys Ala Arg
GAG AAC CTT CCT TAC CTG GTA GCG TAC CAA GCC ACC GTG TGC GCT AGG
CTC TTG GAA GGA ATG GAC CAT CGC ATG GTT CGG TGG CAC ACG CGA TCC

Ala Gln Ala Pro Pro Pro Ser Trp Asp Gln Met Trp Lys Cys Leu Ile
GCT CAA GCC CCT CCC CCA TCG TGG GAC CAG ATG TGG AAG TGT TTG ATT
CGA GTT CGG GGA GGG GGT AGC ACC CTG GTC TAC ACC TTC ACA AAC TAA

Arg Leu Lys Pro Thr Leu His Gly Pro Thr Pro Leu Leu Tyr Arg Leu
CGC CTC AAG CCC ACC CTC CAT GGG CCA ACA CCC CTG CTA TAC AGA CTG
GCG GAG TTC GGG TGG GAG GTA CCC GGT TGT GGG GAC GAT ATG TCT GAC

Gly Ala Ala Glu Phe
GGC GCT GCC GAA TTC
CCG CGA CGG CTT AAG

t

EcoRI
C33c:

Glu Phe Gly Ala Val Asp Phe Ile Pro Val Glu Asn Leu Glu Thr Thr

GAA TTC GGG GCG GTG GAC TTT ATC CCT GTG GAG AAC CTA GAG ACA ACC 
CTT AAG CCC CGC CAC CTG AAA TAG GGA CAC CTC TTG GAT CTC TGT TGG 
t 

EcoRI 

Met Arg Ser Pro Val Phe Thr Asp Asn Ser Ser Pro Pro Val Val Pro 

ATG AGG TCC CCG GTG TTC ACG GAT AAC TCC TCT CCA CCA GTA GTG CCC 
TAC TCC AGG GGC CAC AAG TGC CTA TTG AGG AGA GGT GGT CAT CAC GGG 




Gln Ser Phe Gln Val Ala His Leu His Ala Pro Thr Gly Ser Gly Lys
CAG AGC TTC CAG GTG GCT CAC CTC CAT GCT CCC ACA GGC AGC GGC AAA 
GTC TCG AAG GTC CAC CGA GTG GAG GTA CGA GGG TGT CCG TCG CCG TTT 



Ser Thr Lys Val Pro Ala Ala Tyr Ala Ala Gln Gly Tyr Lys Val Leu

AGC ACC AAG GTC CCG GCT GCA TAT GCA GCT CAG GGC TAT AAG GTG CTA 
TCG TGG TTC CAG GGC CGA CGT ATA CGT CGA GTC CCG ATA TTC CAC GAT 



Val Leu Asn Pro Ser Val Ala Ala Thr Leu Gly Phe Gly Ala Tyr Met 

GTA CTC AAC CCC TCT GTT GCT GCA ACA CTG GGC TTT GGT GCT TAC ATG 
CAT GAG TTG GGG AGA CAA CGA CGT TGT GAC CCG AAA CCA CGA ATG TAC 



Ser Lys Ala His Gly Ile Asp Pro Asn Ile Arg Thr Gly Val Arg Thr

TCC AAG GCT CAT GGG ATC GAT CCT AAC ATC AGG ACC GGG GTG AGA ACA 
AGG TTC CGA GTA CCC TAG CTA GGA TTG TAG TCC TGG CCC CAC TCT TGT 



Ile Thr Thr Gly Ser Pro Ile Thr Tyr Ser Thr Tyr Gly Lys Phe Leu

ATT ACC ACT GGC AGC CCC ATC ACG TAC TCC ACC TAC GGC AAG TTC CTT 
TAA TGG TGA CCG TCG GGG TAG TGC ATG AGG TGG ATG CCG TTC AAG GAA 



Ala Asp Gly Gly Cys Ser Gly Gly Ala Tyr Asp Ile Ile Ile Cys Asp

GCC GAC GGC GGG TGC TCG GGG GGC GCT TAT GAC ATA ATA ATT TGT GAC 
CGG CTG CCG CCC ACG AGC CCC CCG CGA ATA CTG TAT TAT TAA ACA CTG 



Glu Cys His Ser Thr Asp Ala Thr Ser Ile Leu Gly Ile Gly Thr Val

GAG TGC CAC TCC ACG GAT GCC ACA TCC ATC TTG GGC ATT GGC ACT GTC 
CTC ACG GTG AGG TGC CTA CGG TGT AGG TAG AAC CCG TAA CCG TGA CAG 



Figure S 
Leu Asp Gln Ala Glu Thr Ala Gly Ala Arg Leu Val Val Leu Ala Thr
CTT GAC CAA GCA GAG ACT GCG GGG GCG AGA CTG GTT GTG CTC GCC ACC
GAA CTG GTT CGT CTC TGA CGC CCC CGC TCT GAC CAA CAC GAG CGG TGG

Ala Thr Pro Pro Gly Ser Val Thr Val Pro His Pro Asn Ile Glu Glu
GCC ACC CCT CCG GGC TCC GTC ACT GTG CCC CAT CCC AAC ATC GAG GAG
CGG TGG GGA GGC CCG AGG CAG TGA CAC GGG GTA GGG TTG TAG CTC CTC

Val Ala Leu Ser Thr Thr Gly Glu Ile Pro Phe Tyr Gly Lys Ala Ile
GTT GCT CTG TCC ACC ACC GGA GAG ATC CCT TTT TAC GGC AAG GCT ATC
CAA CGA GAC AGG TGG TGG CCT CTC TAG GGA AAA ATG CCG TTC CGA TAG

Pro Leu Glu Val Ile Lys Gly Gly Arg His Leu Ile Phe Cys His Ser
CCC CTC GAA GTA ATC AAG GGG GGG AGA CAT CTC ATC TTC TGT CAT TCA
GGG GAG CTT CAT TAG TTC CCC CCC TCT GTA GAG TAG AAG ACA GTA AGT

Lys Lys Lys Cys Asp Glu Leu Ala Ala Lys Leu Val Ala Leu Gly Ile
AAG AAG AAG TGC GAC GAA CTC GCC GCA AAG CTG GTC GCA TTG GGC ATC
TTC TTC TTC ACG CTG CTT GAG CGG CGT TTC GAC CAG CGT AAC CCG TAG

Asn Ala Val Ala Tyr Tyr Arg Gly Leu Asp Val Ser Val Ile Pro Thr
AAT GCC GTG GCC TAC TAC CGC GGT CTT GAC GTG TCC GTC ATC CCG ACC
TTA CGG CAC CGG ATG ATG GCG CCA GAA CTG CAC AGG CAG TAG GGC TGG

Ser Gly Asp Val Val Val Val Ala Thr Asp Ala Leu Met Thr Gly Tyr
AGC GGC GAT GTT GTC GTC GTG GCA ACC GAT GCC CTC ATG ACC GGC TAT
TCG CCG CTA CAA CAG CAG CAC CGT TGG CTA CGG GAG TAC TGG CCG ATA

Thr Gly Asp Phe Asp Ser Val Ile Asp Cys Asn Thr Cys Ala Glu Phe
ACC GGC GAC TTC GAC TCG GTG ATA GAC TGC AAT ACG TGT GCC GAA TTC
TGG CCG CTG AAG CTG AGC CAC TAT CTG ACG TTA TGC ACA CGG CTT AAG

t t

HinfI EcoRI
WO 91/15596    PCT/US91/02209

[Schematic of clone assembly; the drawing did not survive extraction. Labels recovered, in order of appearance: C35, C31, C33c -> C200; C7f, C20c -> C7f+C20c; C26d, C33c -> C300; C7f+C20c, C300 -> C7fC20cC300; C200. Restriction sites labeled: BglI, HinfI, SfaNI, DdeI, NaeI, NdeI.]
-155 -150

Met Ala Thr Asn Pro Val Cys Val Leu
ATG GCT ACA AAC CCT GTT TGC GTT TTG 
TAC CGA TGT TTG GGA CAA ACG CAA AAC 

-145 -140 -135 

Lys Gly Asp Gly Pro Val Gln Gly Ile Ile Asn Phe Glu Gln Lys Glu

AAG GGT GAC GGC CCA GTT CAA GGT ATT ATT AAC TTC GAG CAG AAG GAA 
TTC CCA CTG CCG GGT CAA GTT CCA TAA TAA TTG AAG CTC GTC TTC CTT 

-130 -125 -120 -115

Ser Asn Gly Pro Val Lys Val Trp Gly Ser Ile Lys Gly Leu Thr Glu

AGT AAT GGA CCA GTG AAG GTG TGG GGA AGC ATT AAA GGA CTG ACT GAA 
TCA TTA CCT GGT CAC TTC CAC ACC CCT TCG TAA TTT CCT GAC TGA CTT 

-110 -105 -100

Gly Leu His Gly Phe His Val His Glu Phe Gly Asp Asn Thr Ala Gly 
GGC CTG CAT GGA TTC CAT GTT CAT GAG TTT GGA GAT AAT ACA GCA GGC 
CCG GAC GTA CCT AAG GTA CAA GTA CTC AAA CCT CTA TTA TGT CGT CCG 

-95 -90 -85 

Cys Thr Ser Pro Gly Pro His Phe Asn Pro Leu Ser Arg Lys His Gly 

TGT ACC AGT CCA GGT CCT CAC TTT AAT CCT CTA TCC AGA AAA CAC GGT 
ACA TGG TCA GGT CCA GGA GTG AAA TTA GGA GAT AGG TCT TTT GTG CCA 

-80 -75 -70 

Gly Pro Lys Asp Glu Glu Arg His Val Gly Asp Leu Gly Asn Val Thr 

GGG CCA AAG GAT GAA GAG AGG CAT GTT GGA GAC TTG GGC AAT GTG ACT 
CCC GGT TTC CTA CTT CTC TCC GTA CAA CCT CTG AAC CCG TTA CAC TGA 

-65 -60 -55 

Ala Asp Lys Asp Gly Val Ala Asp Val Ser Ile Glu Asp Ser Val Ile
GCT GAC AAA GAT GGT GTG GCC GAT GTG TCT ATT GAA GAT TCT GTG ATC 
CGA CTG TTT CTA CCA CAC CGG CTA CAC AGA TAA CTT CTA AGA CAC TAG 

-50 -45 -40 -35 

Ser Leu Ser Gly Asp His Cys Ile Ile Gly Arg Thr Leu Val Val His
TCA CTC TCA GGA GAC CAT TGC ATC ATT GGC CGC ACA CTG GTG GTC CAT 
AGT GAG AGT CCT CTG GTA ACG TAG TAA CCG GCG TGT GAC CAC CAG GTA 

-30 -25 -20 

Glu Lys Ala Asp Asp Leu Gly Lys Gly Gly Asn Glu Glu Ser Thr Lys 

GAA AAA GCA GAT GAC TTG GGC AAA GGT GGA AAT GAA GAA AGT ACA AAG 
CTT TTT CGT CTA CTG AAC CCG TTT CCA CCT TTA CTT CTT TCA TGT TTC 

-15 -10 -5 

Thr Gly Asn Ala Gly Ser Arg Leu Ala Cys Gly Val Ile Gly Ile Arg
ACA GGA AAC GCT GGA AGT CGT TTG GCT TGT GGT GTA ATT GGG ATC CGA 
TGT CCT TTG CGA CCT TCA GCA AAC CGA ACA CCA CAT TAA CCC TAG GCT 



Figure 10
1 5 10

Arg Ile Gly Thr Tyr Val Tyr Asn His Leu Thr Pro Leu Arg Asp Trp

ATT CGG GGC ACC TAT GTT TAT AAC CAT CTC ACT CCT CTT CGG GAC TGG 
TAA GCC CCG TGG ATA CAA ATA TTG GTA GAG TGA GGA GAA GCC CTG ACC 



15 20 25 30 

Ala His Asn Gly Leu Arg Asp Leu Ala Val Ala Val Glu Pro Val Val 

GCG CAC AAC GGC TTG CGA GAT CTG GCC GTG GCT GTA GAG CCA GTC GTC 
CGC GTG TTG CCG AAC GCT CTA GAC CGG CAC CGA CAT CTC GGT CAG CAG 



35 40 45

Phe Ser Gln Met Glu Thr Lys Leu Ile Thr Trp Gly Ala Asp Thr Ala
TTC TCC CAA ATG GAG ACC AAG CTC ATC ACG TGG GGG GCA GAT ACC GCC
AAG AGG GTT TAC CTC TGG TTC GAG TAG TGC ACC CCC CGT CTA TGG CGG

50 55 60

Ala Cys Gly Asp Ile Ile Asn Gly Leu Pro Val Ser Ala Arg Arg Gly
GCG TGC GGT GAC ATC ATC AAC GGC TTG CCT GTT TCC GCC CGC AGG GGC
CGC ACG CCA CTG TAG TAG TTG CCG AAC GGA CAA AGG CGG GCG TCC CCG



65 70 75 

Arg Glu Ile Leu Leu Gly Pro Ala Asp Gly Met Val Ser Lys Gly Trp
CGG GAG ATA CTG CTC GGG CCA GCC GAT GGA ATG GTC TCC AAG GGT TGG 
GCC CTC TAT GAC GAG CCC GGT CGG CTA CCT TAC CAG AGG TTC CCA ACC 



80 85 90 

Arg Leu Leu Ala Pro Ile Thr Ala Tyr Ala Gln Gln Thr Arg Gly Leu

AGG TTG CTG GCG CCC ATC ACG GCG TAC GCC CAG CAG ACA AGG GGC CTC 
TCC AAC GAC CGC GGG TAG TGC CGC ATG CGG GTC GTC TGT TCC CCG GAG 



95 100 105 110

Leu Gly Cys Ile Ile Thr Ser Leu Thr Gly Arg Asp Lys Asn Gln Val
CTA GGG TGC ATA ATC ACC AGC CTA ACT GGC CGG GAC AAA AAC CAA GTG
GAT CCC ACG TAT TAG TGG TCG GAT TGA CCG GCC CTG TTT TTG GTT CAC



115 120 125 

Glu Gly Glu Val Gln Ile Val Ser Thr Ala Ala Gln Thr Phe Leu Ala
GAG GGT GAG GTC CAG ATT GTG TCA ACT GCT GCC CAA ACC TTC CTG GCA 
CTC CCA CTC CAG GTC TAA CAC AGT TGA CGA CGG GTT TGG AAG GAC CGT 

Figure 10 (continued)
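
The numbers printed above the rows of this figure mark every fifth residue, starting at -155 on the first Met, running up through the negative positions and then restarting at 1. The row lengths suggest that no residue is numbered 0; that is an inference from the layout rather than something the figure states. A small sketch under that assumption follows. The function names and the row widths (9 residues in the first row, 16 afterwards) are read off the figure and are illustrative only.

```python
def figure_numbering(count, first=-155):
    """Residue numbers for `count` residues, skipping 0 (an assumption)."""
    numbers = []
    n = first
    while len(numbers) < count:
        if n != 0:
            numbers.append(n)
        n += 1
    return numbers


def row_labels(numbers, first_row=9, width=16):
    """Split the numbering into figure rows and keep the printed labels.

    A label is printed at residue 1 and at every multiple of five.
    """
    rows = [numbers[:first_row]]
    rows += [numbers[i:i + width] for i in range(first_row, len(numbers), width)]
    return [[n for n in row if n % 5 == 0 or n == 1] for row in rows]


# Hypothetical usage: the first twelve rows of the figure.
labels = row_labels(figure_numbering(9 + 16 * 11))
for row in labels:
    print(" ".join(str(n) for n in row))
```

Comparing the printed labels with the number lines in the figure is a quick way to spot rows whose numbering was garbled in extraction.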
130 135 140

Thr Cys Ile Ile Asn Gly Val Cys Trp Thr Val Tyr His Gly Ala Gly
ACG TGC ATC ATC AAT GGG GTG TGC TGG ACT GTC TAC CAC GGG GCC GGA
TGC ACG TAG TAG TTA CCC CAC ACG ACC TGA CAG ATG GTG CCC CGG CCT

145 150 155

Thr Arg Thr Ile Ala Ser Pro Lys Gly Pro Val Ile Gln Met Tyr Thr
ACG AGG ACC ATC GCG TCA CCC AAG GGT CCT GTC ATC CAG ATG TAT ACC
TGC TCC TGG TAG CGC AGT GGG TTC CCA GGA CAG TAG GTC TAC ATA TGG

160 165 170

Asn Val Asp Gln Asp Leu Val Gly Trp Pro Ala Ser Gln Gly Thr Arg
AAT GTA GAC CAA GAC CTT GTG GGC TGG CCC GCT TCG CAA GGT ACC CGC
TTA CAT CTG GTT CTG GAA CAC CCG ACC GGG CGA AGC GTT CCA TGG GCG

175 180 185 190

Ser Leu Thr Pro Cys Thr Cys Gly Ser Ser Asp Leu Tyr Leu Val Thr
TCA TTG ACA CCC TGC ACT TGC GGC TCC TCG GAC CTT TAC CTG GTC ACG
AGT AAC TGT GGG ACG TGA ACG CCG AGG AGC CTG GAA ATG GAC CAG TGC

195 200 205

Arg His Ala Asp Val Ile Pro Val Arg Arg Arg Gly Asp Ser Arg Gly
AGG CAC GCC GAT GTC ATT CCC GTG CGC CGG CGG GGT GAT AGC AGG GGC
TCC GTG CGG CTA CAG TAA GGG CAC GCG GCC GCC CCA CTA TCG TCC CCG

NaeI

210 215 220

Ser Leu Leu Ser Pro Arg Pro Ile Ser Tyr Leu Lys Gly Ser Ser Gly
AGC CTG CTG TCG CCC CGG CCC ATT TCC TAC TTG AAA GGC TCC TCG GGG
TCG GAC GAC AGC GGG GCC GGG TAA AGG ATG AAC TTT CCG AGG AGC CCC

225 230 235

Gly Pro Leu Leu Cys Pro Ala Gly His Ala Val Gly Ile Phe Arg Ala
GGT CCG CTG TTG TGC CCC GCG GGG CAC GCC GTG GGC ATA TTT AGG GCC
CCA GGC GAC AAC ACG GGG CGC CCC GTG CGG CAC CCG TAT AAA TCC CGG

240 245 250

Ala Val Cys Thr Arg Gly Val Ala Lys Ala Val Asp Phe Ile Pro Val
GCG GTG TGC ACC CGT GGA GTG GCT AAG GCG GTG GAC TTT ATC CCT GTG
CGC CAC ACG TGG GCA CCT CAC CGA TTC CGC CAC CTG AAA TAG GGA CAC

Figure 10 (continued)
255 260 265 270

Glu Asn Leu Glu Thr Thr Met Arg Ser Pro Val Phe Thr Asp Asn Ser 
GAG AAC CTA GAG ACA ACC ATG AGG TCC CCG GTG TTC ACG GAT AAC TCC 
CTC TTG GAT CTC TGT TGG TAC TCC AGG GGC CAC AAG TGC CTA TTG AGG 



275 280 285 

Ser Pro Pro Val Val Pro Gln Ser Phe Gln Val Ala His Leu His Ala

TCT CCA CCA GTA GTG CCC CAG AGC TTC CAG GTG GCT CAC CTC CAT GCT 
AGA GGT GGT CAT CAC GGG GTC TCG AAG GTC CAC CGA GTG GAG GTA CGA 



290 295 300 

Pro Thr Gly Ser Gly Lys Ser Thr Lys Val Pro Ala Ala Tyr Ala Ala 

CCC ACA GGC AGC GGC AAA AGC ACC AAG GTC CCG GCT GCA TAT GCA GCT 
GGG TGT CCG TCG CCG TTT TCG TGG TTC CAG GGC CGA CGT ATA CGT CGA 

t 

NdeI

305 310 315 

Gln Gly Tyr Lys Val Leu Val Leu Asn Pro Ser Val Ala Ala Thr Leu

CAG GGC TAT AAG GTG CTA GTA CTC AAC CCC TCT GTT GCT GCA ACA CTG 
GTC CCG ATA TTC CAC GAT CAT GAG TTG GGG AGA CAA CGA CGT TGT GAC 



320 325 330 

Gly Phe Gly Ala Tyr Met Ser Lys Ala His Gly Ile Asp Pro Asn Ile

GGC TTT GGT GCT TAC ATG TCC AAG GCT CAT GGG ATC GAT CCT AAC ATC 
CCG AAA CCA CGA ATG TAC AGG TTC CGA GTA CCC TAG CTA GGA TTG TAG 



335 340 345 350 

Arg Thr Gly Val Arg Thr Ile Thr Thr Gly Ser Pro Ile Thr Tyr Ser

AGG ACC GGG GTG AGA ACA ATT ACC ACT GGC AGC CCC ATC ACG TAC TCC 
TCC TGG CCC CAC TCT TGT TAA TGG TGA CCG TCG GGG TAG TGC ATG AGG 



355 360 365 

Thr Tyr Gly Lys Phe Leu Ala Asp Gly Gly Cys Ser Gly Gly Ala Tyr 

ACC TAC GGC AAG TTC CTT GCC GAC GGC GGG TGC TCG GGG GGC GCT TAT 
TGG ATG CCG TTC AAG GAA CGG CTG CCG CCC ACG AGC CCC CCG CGA ATA 



370 375 380 

Asp Ile Ile Ile Cys Asp Glu Cys His Ser Thr Asp Ala Thr Ser Ile

GAC ATA ATA ATT TGT GAC GAG TGC CAC TCC ACG GAT GCC ACA TCC ATC
CTG TAT TAT TAA ACA CTG CTC ACG GTG AGG TGC CTA CGG TGT AGG TAG



Figure 10 (continued)
385 390 395

Leu Gly Ile Gly Thr Val Leu Asp Gln Ala Glu Thr Ala Gly Ala Arg

TTG GGC ATT GGC ACT GTC CTT GAC CAA GCA GAG ACT GCG GGG GCG AGA 
AAC CCG TAA CCG TGA CAG GAA CTG GTT CGT CTC TGA CGC CCC CGC TCT 



400 405 410 

Leu Val Val Leu Ala Thr Ala Thr Pro Pro Gly Ser Val Thr Val Pro 

CTG GTT GTG CTC GCC ACC GCC ACC CCT CCG GGC TCC GTC ACT GTG CCC
GAC CAA CAC GAG CGG TGG CGG TGG GGA GGC CCG AGG CAG TGA CAC GGG 



415 420 425 430

His Pro Asn Ile Glu Glu Val Ala Leu Ser Thr Thr Gly Glu Ile Pro
CAT CCC AAC ATC GAG GAG GTT GCT CTG TCC ACC ACC GGA GAG ATC CCT
GTA GGG TTG TAG CTC CTC CAA CGA GAC AGG TGG TGG CCT CTC TAG GGA

435 440 445

Phe Tyr Gly Lys Ala Ile Pro Leu Glu Val Ile Lys Gly Gly Arg His
TTT TAC GGC AAG GCT ATC CCC CTC GAA GTA ATC AAG GGG GGG AGA CAT
AAA ATG CCG TTC CGA TAG GGG GAG CTT CAT TAG TTC CCC CCC TCT GTA

450 455 460

Leu Ile Phe Cys His Ser Lys Lys Lys Cys Asp Glu Leu Ala Ala Lys
CTC ATC TTC TGT CAT TCA AAG AAG AAG TGC GAC GAA CTC GCC GCA AAG
GAG TAG AAG ACA GTA AGT TTC TTC TTC ACG CTG CTT GAG CGG CGT TTC

465 470 475

Leu Val Ala Leu Gly Ile Asn Ala Val Ala Tyr Tyr Arg Gly Leu Asp
CTG GTC GCA TTG GGC ATC AAT GCC GTG GCC TAC TAC CGC GGT CTT GAC
GAC CAG CGT AAC CCG TAG TTA CGG CAC CGG ATG ATG GCG CCA GAA CTG

480 485 490

Val Ser Val Ile Pro Thr Ser Gly Asp Val Val Val Val Ala Thr Asp
GTG TCC GTC ATC CCG ACC AGC GGC GAT GTT GTC GTC GTG GCA ACC GAT
CAC AGG CAG TAG GGC TGG TCG CCG CTA CAA CAG CAG CAC CGT TGG CTA
495 500 505 510

Ala Leu Met Thr Gly Tyr Thr Gly Asp Phe Asp Ser Val Ile Asp Cys
GCC CTC ATG ACC GGC TAT ACC GGC GAC TTC GAC TCG GTG ATA GAC TGC
CGG GAG TAC TGG CCG ATA TGG CCG CTG AAG CTG AGC CAC TAT CTG ACG

515 520 525

Asn Thr Cys Val Thr Gln Thr Val Asp Phe Ser Leu Asp Pro Thr Phe
AAT ACG TGT GTC ACC CAG ACA GTC GAT TTC AGC CTT GAC CCT ACC TTC
TTA TGC ACA CAG TGG GTC TGT CAG CTA AAG TCG GAA CTG GGA TGG AAG

530 535 540

Thr Ile Glu Thr Ile Thr Leu Pro Gln Asp Ala Val Ser Arg Thr Gln
ACC ATT GAG ACA ATC ACG CTC CCC CAA GAT GCT GTC TCC CGC ACT CAA
TGG TAA CTC TGT TAG TGC GAG GGG GTT CTA CGA CAG AGG GCG TGA GTT

545 550 555

Arg Arg Gly Arg Thr Gly Arg Gly Lys Pro Gly Ile Tyr Arg Phe Val
CGT CGG GGC AGG ACT GGC AGG GGG AAG CCA GGC ATC TAC AGA TTT GTG
GCA GCC CCG TCC TGA CCG TCC CCC TTC GGT CCG TAG ATG TCT AAA CAC

560 565 570

Ala Pro Gly Glu Arg Pro Pro Gly Met Phe Asp Ser Ser Val Leu Cys
GCA CCG GGG GAG CGC CCT CCC GGC ATG TTC GAC TCG TCC GTC CTC TGT
CGT GGC CCC CTC GCG GGA GGG CCG TAC AAG CTG AGC AGG CAG GAG ACA

575 580 585 590

Glu Cys Tyr Asp Ala Gly Cys Ala Trp Tyr Glu Leu Thr Pro Ala Glu
GAG TGC TAT GAC GCA GGC TGT GCT TGG TAT GAG CTC ACG CCC GCC GAG
CTC ACG ATA CTG CGT CCG ACA CGA ACC ATA CTC GAG TGC GGG CGG CTC

595 600 605

Thr Thr Val Arg Leu Arg Ala Tyr Met Asn Thr Pro Gly Leu Pro Val
ACT ACA GTT AGG CTA CGA GCG TAC ATG AAC ACC CCG GGG CTT CCC GTG
TGA TGT CAA TCC GAT GCT CGC ATG TAC TTG TGG GGC CCC GAA GGG CAC
610 615 620

Cys Gln Asp His Leu Glu Phe Trp Glu Gly Val Phe Thr Gly Leu Thr
TGC CAG GAC CAT CTT GAA TTT TGG GAG GGC GTC TTT ACA GGC CTC ACT
ACG GTC CTG GTA GAA CTT AAA ACC CTC CCG CAG AAA TGT CCG GAG TGA



625 630 635

His Ile Asp Ala His Phe Leu Ser Gln Thr Lys Gln Ser Gly Glu Asn
CAT ATA GAT GCC CAC TTT CTA TCC CAG ACA AAG CAG AGT GGG GAG AAC
GTA TAT CTA CGG GTG AAA GAT AGG GTC TGT TTC GTC TCA CCC CTC TTG



640 645 650 

Leu Pro Tyr Leu Val Ala Tyr Gin Ala Thr Val Cys Ala Arg Ala Gin 
CTT CCT TAC CTG GTA GCG TAC CAA GCC ACC GTG TGC GCT AGG GCT CAA
GAA GGA ATG GAC CAT CGC ATG GTT CGG TGG CAC ACG CGA TCC CGA GTT 



655 660 665 670 

Ala Pro Pro Pro Ser Trp Asp Gln Met Trp Lys Cys Leu Ile Arg Leu

GCC CCT CCC CCA TCG TGG GAC CAG ATG TGG AAG TGT TTG ATT CGC CTC 
CGG GGA GGG GGT AGC ACC CTG GTC TAC ACC TTC ACA AAC TAA GCG GAG 



675 680 685

Lys Pro Thr Leu His Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly Ala
AAG CCC ACC CTC CAT GGG CCA ACA CCC CTG CTA TAC AGA CTG GGC GCT
TTC GGG TGG GAG GTA CCC GGT TGT GGG GAC GAT ATG TCT GAC CCG CGA

Figure 10 (continued)



INTERNATIONAL SEARCH REPORT

International Application No. PCT/US 91/02209

IPC: C 12 Q, G 01 N 33/00

P,A

CRC Critical Reviews in Biotechnology, vol. 8, issue 2, 1988, B.D. Korant et al., "Viral Proteases: An Emerging Therapeutic Target", pages 149-157, see page 153, table 4 - page 154, line 41.

Science, vol. 247, January 26, 1990 (26.01.90), T.J. McQuade et al., "A synthetic HIV-1 Protease inhibitor with antiviral activity arrests HIV-like particle maturation", pages 454-456, see abstract.

EP, A2, 0 414 475 (CHIRON CORPORATION), 27 February 1991 (27.02.91), see page 15, lines 4-17 and



* Special categories of cited documents:

"A" document defining the general state of the art which is not considered to be of particular relevance
"E" earlier document but published on or after the international filing date
"L" document which may throw doubts on priority claim(s) or which is cited to establish the publication date of another citation or other special reason (as specified)
"O" document referring to an oral disclosure, use, exhibition or other means
"P" document published prior to the international filing date but later than the priority date claimed
"T" later document published after the international filing date or priority date and not in conflict with the application but cited to understand the principle or theory underlying the invention
"X" document of particular relevance; the claimed invention cannot be considered novel or cannot be considered to involve an inventive step
"Y" document of particular relevance; the claimed invention cannot be considered to involve an inventive step when the document is combined with one or more other such documents, such combination being obvious to a person skilled in the art
"&" document member of the same patent family

IV. CERTIFICATION

Date of the Actual Completion of the International Search

11 July 1991

International Searching Authority

EUROPEAN PATENT OFFICE

Date of Mailing of this International Search Report

FRANK

Form PCT/ISA/210 (second sheet) (January 1985)



International Application No. PCT/US 91/02209

III. DOCUMENTS CONSIDERED TO BE RELEVANT (CONTINUED FROM THE SECOND SHEET)

Category *    Citation of Document, with indication, where appropriate, of the relevant passages    Relevant to Claim No.

claim 17.

EP, A1, 0 388 232 (CHIRON CORPORATION), 19 September 1990 (19.09.90), see page 22, line 19 - page 23, line 3.

Form PCT/ISA/210 (extra sheet) (January 1985)



ANNEX

to the International Search Report on International Patent Application No. PCT/US 91/02209 WE 46369

This Annex lists the patent family members relating to the patent documents cited in the above-mentioned search report. The Office is in no way liable for these particulars, which are given merely for the purpose of information.

Patent document cited    Publication    Patent family      Publication
in search report         date           member(s)          date

EP-A1- 414475            27-02-91       AU-A1- 63449/90    03-04-91
                                        WO-A1- 9102820     07-03-91

EP-A1- 388232            19-09-90       AU-A1- 52783/90    22-10-90
                                        CA-AA- 2012482     17-09-90
                                        EP-TD- 388232      02-05-91
                                        HU-A0- 902814      28-03-91
                                        HU-A2- 54896       29-04-91
                                        IL-A0- 93764       23-12-90
                                        NO-A - 904712      30-10-90
                                        NO-A0- 904712      30-10-90
                                        PT-A - 93480       28-09-90
                                        WO-A1- 9011089     04-10-90
                                        FI-A0- 905591      12-11-90

